Lane Detection Optimized with GPU Coder
This example shows how to develop a deep learning lane detection application that runs on NVIDIA® GPUs.
The pretrained lane detection network can detect and output lane marker boundaries from an image and is based on the AlexNet network. The last few layers of the AlexNet network are replaced by a smaller fully connected layer and a regression output layer. The example generates a CUDA executable that runs on a CUDA-enabled GPU on the host machine.
Prerequisites

CUDA enabled NVIDIA GPU.
NVIDIA CUDA toolkit and driver.
NVIDIA cuDNN library.
Environment variables for the compilers and libraries. For information on the supported versions of the compilers and libraries, see Third-Party Hardware. For setting up the environment variables, see Setting Up the Prerequisite Products.
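As an illustrative sanity check before attempting code generation, you can confirm from MATLAB that the toolchain environment variables are visible to the process. The variable names below are examples only; the exact set depends on your platform and installation.

```matlab
% Sanity check (illustrative): confirm that toolchain environment
% variables are visible to MATLAB before code generation. The variable
% names below are examples; the exact set depends on your platform and
% installation.
names = {'CUDA_PATH','NVIDIA_CUDNN','PATH'};
for k = 1:numel(names)
    value = getenv(names{k});
    if isempty(value)
        warning('%s is not set.', names{k});
    else
        fprintf('%s = %s\n', names{k}, value);
    end
end
```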
Verify GPU Environment
Use the coder.checkGpuInstall function to verify that the compilers and libraries necessary for running this example are set up correctly.
envCfg = coder.gpuEnvConfig('host');
envCfg.DeepLibTarget = 'cudnn';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);
Get Pretrained Lane Detection Network
This example uses the trainedLaneNet MAT-file containing the pretrained lane detection network. This file is approximately 143 MB in size. Download the file from the MathWorks website.
laneNetFile = matlab.internal.examples.downloadSupportFile('gpucoder/cnn_models/lane_detection', ...
    'trainedLaneNet.mat');
This network takes an image as an input and outputs two lane boundaries that correspond to the left and right lanes of the ego vehicle. Each lane boundary is represented by the parabolic equation y = ax^2 + bx + c, where y is the lateral offset and x is the longitudinal distance from the vehicle. The network outputs the three parameters a, b, and c per lane. The network architecture is similar to AlexNet except that the last few layers are replaced by a smaller fully connected layer and a regression output layer.
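To make the parameterization concrete, the following sketch evaluates the boundary model for one lane. The coefficient values are made up for illustration; in the example they come from the denormalized network output.

```matlab
% Evaluate y = a*x^2 + b*x + c for one lane boundary over a range of
% longitudinal distances. The coefficient values below are hypothetical;
% in the example they come from the denormalized network output.
boundary = [-0.002, 0.05, 1.2];   % hypothetical [a b c]
xLongitudinal = 3:30;             % distance ahead of the vehicle (meters)
yLateral = polyval(boundary, xLongitudinal);  % lateral offset (meters)
plot(xLongitudinal, yLateral), xlabel('x (m)'), ylabel('y (m)')
```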
load(laneNetFile);
disp(laneNet)
  SeriesNetwork with properties:

         Layers: [23×1 nnet.cnn.layer.Layer]
     InputNames: {'data'}
    OutputNames: {'output'}
To view the network architecture, use the analyzeNetwork function.
analyzeNetwork(laneNet)
Download Test Video
To test the model, the example uses a video file from the Caltech lanes dataset. The file is approximately 8 MB in size. Download the file from the MathWorks website.
videoFile = matlab.internal.examples.downloadSupportFile('gpucoder/media','caltech_cordova1.avi');
Main Entry-Point Function
The detectLanesInVideo.m file is the main entry-point function for code generation. The detectLanesInVideo function uses the vision.VideoFileReader (Computer Vision Toolbox) System object to read frames from the input video, calls the predict method of the laneNet network object, and draws the detected lanes on the input video. A vision.DeployableVideoPlayer (Computer Vision Toolbox) System object is used to display the lane detected video output.
type detectLanesInVideo.m

function detectLanesInVideo(videoFile,net,laneCoeffMeans,laneCoeffsStds)
% detectLanesInVideo Entry-point function for the Lane Detection Optimized
% with GPU Coder example
%
% detectLanesInVideo(videoFile,net,laneCoeffMeans,laneCoeffsStds) uses the
% VideoFileReader System object to read frames from the input video, calls
% the predict method of the laneNet network object, and draws the detected
% lanes on the input video. A DeployableVideoPlayer System object is used
% to display the lane detected video output.

% Copyright 2022 The MathWorks, Inc.

%#codegen

%% Create Video Reader and Video Player Object
videoFReader = vision.VideoFileReader(videoFile);
depVideoPlayer = vision.DeployableVideoPlayer(Name='Lane Detection on GPU');

%% Video Frame Processing Loop
while ~isDone(videoFReader)
    videoFrame = videoFReader();
    scaledFrame = 255.*(imresize(videoFrame,[227 227]));

    [laneFound,ltPts,rtPts] = laneNetPredict(net,scaledFrame, ...
        laneCoeffMeans,laneCoeffsStds);
    if(laneFound)
        pts = [reshape(ltPts',1,[]);reshape(rtPts',1,[])];
        videoFrame = insertShape(videoFrame, 'Line', pts, 'LineWidth', 4);
    end
    depVideoPlayer(videoFrame);
end
end
LaneNet Predict Function
The laneNetPredict function computes the right and left lane positions in a single video frame. The laneNet network computes parameters a, b, and c that describe the parabolic equation for the left and right lane boundaries. From these parameters, compute the x and y coordinates corresponding to the lane positions. The coordinates must be mapped to image coordinates.
type laneNetPredict.m

function [laneFound,ltPts,rtPts] = laneNetPredict(net,frame,means,stds)
% laneNetPredict Predict lane markers on the input image frame using the
% lane detection network
%

% Copyright 2017-2022 The MathWorks, Inc.

%#codegen

% A persistent object laneNet is used to load the network object. At the
% first call to this function, the persistent object is constructed and
% set up. When the function is called subsequent times, the same object is
% reused to call predict on inputs, thus avoiding reconstructing and
% reloading the network object.
persistent laneNet;
if isempty(laneNet)
    laneNet = coder.loadDeepLearningNetwork(net, 'laneNet');
end

laneCoeffsNetworkOutput = predict(laneNet,frame);

% Recover original coeffs by reversing the normalization steps.
params = laneCoeffsNetworkOutput .* stds + means;

% 'c' should be more than 0.5 for it to be a lane.
isRightLaneFound = abs(params(6)) > 0.5;
isLeftLaneFound  = abs(params(3)) > 0.5;

% From the network's output, compute left and right lane points in the
% image coordinates.
vehicleXPoints = 3:30;
ltPts = coder.nullcopy(zeros(28,2,'single'));
rtPts = coder.nullcopy(zeros(28,2,'single'));

if isRightLaneFound && isLeftLaneFound
    rtBoundary = params(4:6);
    rt_y = computeBoundaryModel(rtBoundary, vehicleXPoints);

    ltBoundary = params(1:3);
    lt_y = computeBoundaryModel(ltBoundary, vehicleXPoints);

    % Visualize lane boundaries of the ego vehicle.
    tform = get_tformToImage;

    % Map vehicle to image coordinates.
    ltPts = tform.transformPointsInverse([vehicleXPoints', lt_y']);
    rtPts = tform.transformPointsInverse([vehicleXPoints', rt_y']);
    laneFound = true;
else
    laneFound = false;
end
end
%% Helper Functions

% Compute boundary model.
function yWorld = computeBoundaryModel(model, xWorld)
yWorld = polyval(model, xWorld);
end

% Compute extrinsics.
function tform = get_tformToImage
% The camera coordinates are described by the Caltech mono
% camera model.
yaw = 0;
pitch = 14; % Pitch of the camera in degrees
roll = 0;

translation = translationVector(yaw, pitch, roll);
rotation = rotationMatrix(yaw, pitch, roll);

% Construct a camera matrix.
focalLength = [309.4362, 344.2161];
principalPoint = [318.9034, 257.5352];
skew = 0;

camMatrix = [rotation; translation] * intrinsicMatrix(focalLength, ...
    skew, principalPoint);

% Turn camMatrix into 2-D homography.
tform2D = [camMatrix(1,:); camMatrix(2,:); camMatrix(4,:)]; % drop Z
tform = projective2d(tform2D);
tform = tform.invert();
end
% Translate to image coordinates.
function translation = translationVector(yaw, pitch, roll)
sensorLocation = [0 0];
height = 2.1798; % Mounting height in meters from the ground
rotationMatrix = (...
    rotZ(yaw)*... % last rotation
    rotX(90-pitch)*...
    rotZ(roll)... % first rotation
    );

% Adjust for the sensorLocation by adding a translation.
sl = sensorLocation;

translationInWorldUnits = [sl(2), sl(1), height];
translation = translationInWorldUnits*rotationMatrix;
end

% Rotation around X-axis.
function R = rotX(a)
a = deg2rad(a);
R = [...
    1  0       0;
    0  cos(a) -sin(a);
    0  sin(a)  cos(a)];
end

% Rotation around Y-axis.
function R = rotY(a)
a = deg2rad(a);
R = [...
    cos(a)  0  sin(a);
    0       1  0;
   -sin(a)  0  cos(a)];
end

% Rotation around Z-axis.
function R = rotZ(a)
a = deg2rad(a);
R = [...
    cos(a) -sin(a) 0;
    sin(a)  cos(a) 0;
    0       0      1];
end

% Given the yaw, pitch, and roll, determine the appropriate Euler angles
% and the sequence in which they are applied to align the camera's
% coordinate system with the vehicle coordinate system. The resulting
% matrix is a rotation matrix that together with the translation vector
% defines the extrinsic parameters of the camera.
function rotation = rotationMatrix(yaw, pitch, roll)
rotation = (...
    rotY(180)*...      % last rotation: point Z up
    rotZ(-90)*...      % X-Y swap
    rotZ(yaw)*...      % point the camera forward
    rotX(90-pitch)*... % "un-pitch"
    rotZ(roll)...      % 1st rotation: "un-roll"
    );
end

% Intrinsic matrix computation.
function intrinsicMat = intrinsicMatrix(focalLength, skew, principalPoint)
intrinsicMat = ...
    [focalLength(1)   , 0                , 0; ...
     skew             , focalLength(2)   , 0; ...
     principalPoint(1), principalPoint(2), 1];
end
Generate CUDA Executable
To generate a standalone CUDA executable for the detectLanesInVideo entry-point function, create a GPU code configuration object for an 'exe' target and set the target language to C++. Use the coder.DeepLearningConfig function to create a cuDNN deep learning configuration object and assign it to the DeepLearningConfig property of the GPU code configuration object.
cfg = coder.gpuConfig('exe');
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
cfg.GenerateReport = true;
cfg.GenerateExampleMain = "GenerateCodeAndCompile";
cfg.TargetLang = 'C++';
inputs = {coder.Constant(videoFile),coder.Constant(laneNetFile), ...
    coder.Constant(laneCoeffMeans),coder.Constant(laneCoeffsStds)};
Run the codegen command.
codegen -args inputs -config cfg detectLanesInVideo
Code generation successful: View report
Generated Code Description
The series network is generated as a C++ class containing an array of 18 layer classes (after layer fusion optimization). The setup() method of the class sets up handles and allocates memory for each layer object. The predict() method invokes prediction for each of the 18 layers in the network.
class lanenet0_0 {
 public:
  lanenet0_0();
  void setSize();
  void resetState();
  void setup();
  void predict();
  void cleanup();
  float *getLayerOutput(int layerIndex, int portIndex);
  int getLayerOutputSize(int layerIndex, int portIndex);
  float *getInputDataPointer(int b_index);
  float *getInputDataPointer();
  float *getOutputDataPointer(int b_index);
  float *getOutputDataPointer();
  int getBatchSize();
  ~lanenet0_0();
 private:
  void allocate();
  void postsetup();
  void deallocate();
 public:
  boolean_T isInitialized;
  boolean_T matlabCodegenIsDeleted;
 private:
  int numLayers;
  MWTensorBase *inputTensors[1];
  MWTensorBase *outputTensors[1];
  MWCNNLayer *layers[18];
  MWCudnnTarget::MWTargetNetworkImpl *targetImpl;
};
The cnn_lanenet*_conv*_w and cnn_lanenet*_conv*_b files are the binary weights and bias files for the convolution layers in the network. The cnn_lanenet*_fc*_w and cnn_lanenet*_fc*_b files are the binary weights and bias files for the fully connected layers in the network.
codegendir = fullfile('codegen', 'exe', 'detectLanesInVideo');
dir([codegendir,filesep,'*.bin'])
cnn_lanenet0_0_conv1_b.bin
cnn_lanenet0_0_conv1_w.bin
cnn_lanenet0_0_conv2_b.bin
cnn_lanenet0_0_conv2_w.bin
cnn_lanenet0_0_conv3_b.bin
cnn_lanenet0_0_conv3_w.bin
cnn_lanenet0_0_conv4_b.bin
cnn_lanenet0_0_conv4_w.bin
cnn_lanenet0_0_conv5_b.bin
cnn_lanenet0_0_conv5_w.bin
cnn_lanenet0_0_data_offset.bin
cnn_lanenet0_0_data_scale.bin
cnn_lanenet0_0_fc6_b.bin
cnn_lanenet0_0_fc6_w.bin
cnn_lanenet0_0_fclane1_b.bin
cnn_lanenet0_0_fclane1_w.bin
cnn_lanenet0_0_fclane2_b.bin
cnn_lanenet0_0_fclane2_w.bin
networkParamsInfo_lanenet0_0.bin
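As an optional sanity check, a sketch like the following counts the values stored in one of the generated weight files. It assumes the weights are written as single-precision floats, which is an internal detail of the generated code and not guaranteed by the documented interface.

```matlab
% Sanity check (illustrative): count the values stored in one generated
% weight file. Assumes single-precision (4-byte) storage; the on-disk
% format is an internal detail and may change across releases.
codegendir = fullfile('codegen','exe','detectLanesInVideo');
fid = fopen(fullfile(codegendir,'cnn_lanenet0_0_conv1_w.bin'),'rb');
if fid ~= -1
    w = fread(fid, Inf, 'single');
    fclose(fid);
    fprintf('conv1 weight file contains %d values\n', numel(w));
end
```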
Run the Executable
To run the executable, uncomment the following lines of code.
if ispc
    [status,cmdout] = system("detectLanesInVideo.exe");
else
    [status,cmdout] = system("./detectLanesInVideo");
end

See Also

Functions
codegen
Related Topics

- Supported Networks, Layers, and Classes
- Deep Learning in MATLAB (Deep Learning Toolbox)