In regular HTTP requests, the connections between client and server are short-lived: a client connects to the server, sends a request, receives the response, and then closes the connection. In this model the server can serve a large number of clients with a small number of workers. The concurrency model in this situation is typically based on threads, processes, or a combination of both.
With WebSocket the problem is more complex, because a WebSocket connection stays open for a long period of time. The server can no longer use a small pool of workers to serve a large number of clients; each client needs its own dedicated worker for as long as it is connected. If you use threads and/or processes, your app will not scale to a large number of clients, because each thread or process is relatively expensive and you can only run so many of them.
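To make the difference concrete, here is a minimal sketch that uses only the Python standard library. It is not a real web server: the `serve` and `run` functions, the pool size of 2, and the hold times are all made up for illustration, and `time.sleep` stands in for the time a worker is tied up by a client.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def serve(client_id, hold_seconds):
    """Pretend to serve one client; the worker is busy for hold_seconds."""
    time.sleep(hold_seconds)
    return client_id


def run(clients, workers, hold_seconds):
    """Serve all clients with a fixed-size pool; return results and elapsed time."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        served = list(pool.map(lambda c: serve(c, hold_seconds), range(clients)))
    return served, time.monotonic() - start


# Short-lived requests: each worker is freed almost immediately.
short_served, short_time = run(clients=8, workers=2, hold_seconds=0.01)

# Long-lived connections: each worker is held, so the other clients queue up.
long_served, long_time = run(clients=8, workers=2, hold_seconds=0.2)

print(f"short requests: {short_time:.2f}s, long-lived connections: {long_time:.2f}s")
```

With short requests, two workers get through eight clients quickly. When every client holds its worker, the remaining clients wait in line, and a connection that never closes would keep them waiting forever.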
This is where gevent enters the picture. Gevent has a concurrency model based on greenlets, which scale much better than threads or processes. Because greenlets are lightweight, serving WebSocket connections with a gevent-based server allows you to support more clients. With uWSGI you have a choice of concurrency models to use with WebSocket, and that includes the greenlet-based model from gevent. You can also use gevent’s web server standalone if you want.
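As a rough illustration of why lightweight cooperative tasks help, here is a sketch that holds 2000 simulated connections open at the same time. Note that it does not use gevent: to keep it runnable with only the standard library, it uses `asyncio` tasks as a stand-in for greenlets. The two are different APIs, but both rely on cooperative scheduling, where a task that is waiting yields control instead of occupying a thread. The `connection` and `main` names and the numbers are made up for the example.

```python
import asyncio
import time


async def connection(client_id, hold_seconds):
    """Simulate an idle long-lived connection; sleeping yields to other tasks."""
    await asyncio.sleep(hold_seconds)
    return client_id


async def main(clients, hold_seconds):
    """Open all connections at once and wait for every one of them to finish."""
    tasks = [
        asyncio.create_task(connection(i, hold_seconds)) for i in range(clients)
    ]
    return await asyncio.gather(*tasks)


start = time.monotonic()
results = asyncio.run(main(clients=2000, hold_seconds=0.2))
elapsed = time.monotonic() - start

print(f"{len(results)} connections held concurrently in {elapsed:.2f}s")
```

All the connections wait concurrently in a single thread, so the total time stays close to the hold time of one connection rather than growing with the number of clients. With gevent, greenlets play the role that tasks play here.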
But note that gevent does not know anything about WebSocket; it is just a server. To use WebSocket connections you have to add an implementation of the WebSocket server.
There are two extensions for Flask that simplify the use of WebSocket. The Flask-Sockets extension by Kenneth Reitz is a wrapper for gevent and gevent-websocket. The Flask-SocketIO extension (shameless plug, as I’m the author) is a wrapper for gevent and gevent-socketio on the server, plus Socket.IO on the client. Socket.IO is a higher-level protocol that uses WebSocket if available but can fall back to other transport mechanisms on older browsers.
I hope this helps!