If I say you wanted to do something with hand gesture recognition in python, what would be the first solution that will pop in your mind: train a CNN, contours, or convexity hull. Sounds good and feasible, but when it comes to actually make use of these techniques, the detection is not very good and requires special conditions(like a proper background or similar conditions that you used while training).

https://medium.com/analytics-vidhya/mediapipe-hand-gesture-based-volume-controller-in-python-w-o-gpu-67db1f30c6ed