Pretty big speed boost, on a 4090 at least. Needs this installed: https://github.com/aredden/torch-cublas-hgemm
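
Since the speedup depends on an extra install, callers may want to check for it at runtime and fall back if it is missing. A minimal sketch, assuming the package exposes a top-level module named `cublas_ops` (an assumption about the linked repo, not confirmed here); `has_cublas_hgemm` is a hypothetical helper name:

```python
import importlib.util

# Assumption: torch-cublas-hgemm installs a top-level module called "cublas_ops".
OPTIONAL_MODULE = "cublas_ops"


def has_cublas_hgemm(module_name: str = OPTIONAL_MODULE) -> bool:
    """Return True if the optional module can be found in this environment."""
    # find_spec locates a top-level module without importing it.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if has_cublas_hgemm():
        print("cublas_ops found: fast path can be enabled")
    else:
        print("cublas_ops not found: falling back to the default path")
```

This only detects whether the module is present; it does not verify that the GPU or CUDA build actually supports the faster kernels.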