For those who want a better understanding of what's going on in this code, let's do a quick run-through of it and see what's happening under the hood. This is not essential to know, just an extra for those who want it.
As always, our code starts by importing the required libraries including OpenCV and the face-recognition library we installed earlier:
import face_recognition
import cv2
import numpy as np
from picamera2 import Picamera2
import time
import pickle
Then we load the pickle file we created earlier and unpack it into two lists: the known face encodings and their matching names.
print("[INFO] loading encodings...")with open("encodings.pickle", "rb") as f: data = pickle.loads(f.read())known_face_encodings = data["encodings"]known_face_names = data["names"]
Using Picamera2, we then initialise the camera with the specified resolution:
picam2 = Picamera2()
picam2.configure(picam2.create_preview_configuration(main={"format": 'XRGB8888', "size": (1920, 1080)}))
picam2.start()
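If you find the preview sluggish, you could also try a lower capture resolution here. For example (just a sketch of an alternative configuration, not the guide's setting):

picam2.configure(picam2.create_preview_configuration(main={"format": 'XRGB8888', "size": (1280, 720)}))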
And then we initialise a whole heap of variables that we will use throughout the rest of the code. Here is also where we can change our cv_scaler:
cv_scaler = 4  # downscale factor for processing (should be a whole number)
face_locations = []
face_encodings = []
face_names = []
frame_count = 0
start_time = time.time()
fps = 0
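To put some numbers on that: with cv_scaler = 4, the 1920 x 1080 camera frame is processed at just 480 x 270, which is 16 times fewer pixels for the recognition step to chew through.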
Next, we create our first function, which takes in a frame and gets the recognition data out of it. It starts by using cv_scaler to scale the frame we feed into it down to a lower resolution, then converts it from BGR to RGB as required by the library:
def process_frame(frame):
    global face_locations, face_encodings, face_names
    
    # Resize the frame using cv_scaler to increase performance (less pixels processed, less time spent)
    resized_frame = cv2.resize(frame, (0, 0), fx=(1/cv_scaler), fy=(1/cv_scaler))
    
    # Convert the image from BGR to RGB colour space, the facial recognition library uses RGB, OpenCV uses BGR
    rgb_resized_frame = cv2.cvtColor(resized_frame, cv2.COLOR_BGR2RGB)
Then we feed the resized frame into the facial recognition library and get back the locations and encodings of any faces it finds.
    face_locations = face_recognition.face_locations(rgb_resized_frame)
    face_encodings = face_recognition.face_encodings(rgb_resized_frame, face_locations, model='large')
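Each entry in face_locations is a (top, right, bottom, left) tuple of pixel coordinates in the resized frame, and each entry in face_encodings is a 128-number NumPy array that acts as a fingerprint for that face. As a quick illustrative check (not part of the original script), you could print them out:

for location, encoding in zip(face_locations, face_encodings):
    print(f"Face at {location} (top, right, bottom, left), encoding length: {len(encoding)}")  # prints 128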
After that, we use a for loop to go through all the faces in the image and see if the encodings match any from our trained model.
    face_names = []
    for face_encoding in face_encodings:
        # See if the face is a match for the known face(s)
        matches = face_recognition.compare_faces(known_face_encodings, face_encoding)
        name = "Unknown"
        
        # Use the known face with the smallest distance to the new face
        face_distances = face_recognition.face_distance(known_face_encodings, face_encoding)
        best_match_index = np.argmin(face_distances)
        if matches[best_match_index]:
            name = known_face_names[best_match_index]
        face_names.append(name)
    
    return frame
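As a side note, compare_faces treats any encoding within a distance of 0.6 as a match by default. If you find the script mistaking strangers for known faces, you could pass a stricter tolerance yourself (a sketch of a tweak, not part of the original script):

matches = face_recognition.compare_faces(known_face_encodings, face_encoding, tolerance=0.5)

Lower values make matching stricter; higher values make it more forgiving.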
We then create another function that takes in the frame, draws boxes around the faces, and labels them with the identified names. The for loop cycles through all the identified faces, using the top, right, bottom, and left coordinates (which we found with face_locations in the last function) to draw a box around each recognised face. But before we can use those coordinates, we must scale them back up by cv_scaler or the boxes won't be drawn in the right position on our camera preview (the processing is done on a down-sized frame, so we have to up-size the coordinates to match our camera preview):
def draw_results(frame):
    # Display the results
    for (top, right, bottom, left), name in zip(face_locations, face_names):
        # Scale back up face locations since the frame we detected in was scaled
        top *= cv_scaler
        right *= cv_scaler
        bottom *= cv_scaler
        left *= cv_scaler
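For example, with cv_scaler = 4, a face whose left edge was found at x = 100 in the down-sized frame actually sits at 100 x 4 = 400 in the full 1920 x 1080 preview frame.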
Then we use the tools provided by OpenCV to actually draw these things. We first draw an empty rectangle around the face in blue with a line thickness of 3, then we draw a solid rectangle on top of that box, and finally place the name on top of that solid box.
        # Draw a box around the face
        cv2.rectangle(frame, (left, top), (right, bottom), (244, 42, 3), 3)
        
        # Draw a label with a name above the face
        cv2.rectangle(frame, (left - 3, top - 35), (right + 3, top), (244, 42, 3), cv2.FILLED)
        font = cv2.FONT_HERSHEY_DUPLEX
        cv2.putText(frame, name, (left + 6, top - 6), font, 1.0, (255, 255, 255), 1)
    
    return frame
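If you'd rather have the label sit under the chin instead of above the forehead, you could anchor the filled rectangle to the bottom coordinate instead. A sketch of that variation (not part of the original script):

        cv2.rectangle(frame, (left - 3, bottom), (right + 3, bottom + 35), (244, 42, 3), cv2.FILLED)
        cv2.putText(frame, name, (left + 6, bottom + 25), font, 1.0, (255, 255, 255), 1)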
And then we create our final function, which calculates our FPS. Every call increments a frame counter, and once at least a second has passed since the stored start time, it divides the frame count by the elapsed time to get the FPS, resets the counter, and stores the current time ready for the next measurement.
def calculate_fps():
    global frame_count, start_time, fps
    frame_count += 1
    elapsed_time = time.time() - start_time
    # Update the FPS reading once per second, then reset the counter and timer
    if elapsed_time > 1:
        fps = frame_count / elapsed_time
        frame_count = 0
        start_time = time.time()
    return fps
With our functions laid out, we are finally ready to get into our infinitely repeating while True loop. This starts by capturing a frame from the camera with Picamera2, then it feeds that frame into our process_frame function, which spits out all the identified faces and their locations. Then we feed that into draw_results, which takes the face locations and identified names and draws them on the frame. This display_frame variable is what we will be telling the Pi to show us in a bit:
while True:
    # Capture a frame from camera
    frame = picam2.capture_array()
    
    # Process the frame with the function
    processed_frame = process_frame(frame)
    
    # Get the text and boxes to be drawn based on the processed frame
    display_frame = draw_results(processed_frame)
After that, we call the calculate_fps function to work out our FPS, use some more OpenCV tools to attach the FPS counter to the display frame, and then show it all in the preview window!
    # Calculate and update FPS
    current_fps = calculate_fps()
    
    # Attach FPS counter to the text and boxes
    cv2.putText(display_frame, f"FPS: {current_fps:.1f}", (display_frame.shape[1] - 150, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    
    # Display everything over the video feed
    cv2.imshow('Video', display_frame)
And finally, at the end of the loop, we check if the "q" key has been pressed. If it has, we exit the while True loop and run the last bit of code, which safely stops the camera and closes the preview window:
    # Break the loop and stop the script if 'q' is pressed
    if cv2.waitKey(1) == ord("q"):
        break

# By breaking the loop we run this code here which closes everything
cv2.destroyAllWindows()
picam2.stop()