To design a robust, high-availability system that handles 100+ concurrent VLM requests with low latency. This person is the "bridge" between the model and the production environment.
To improve the "intelligence" and domain-specificity of your model. This person focuses on model weights, the data it consumes, and keeping you ahead of the research curve.
Reduce the cost and latency of model outputs. This person ensures the model doesn't just work, but runs fast enough for 100 people to chat simultaneously without lag.
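One common lever for serving many chat sessions on one GPU is micro-batching: group requests that arrive close together and run a single batched forward pass instead of many small ones. A minimal sketch; the batch size and the plain queue are illustrative assumptions:

```python
from collections import deque

# Group queued requests into batches of at most MAX_BATCH so the GPU
# runs one batched forward pass per group. MAX_BATCH is an assumption.
MAX_BATCH = 8

def batch_requests(queue: deque, max_batch: int = MAX_BATCH) -> list:
    # Drain up to max_batch items from the front of the queue.
    batch = []
    while queue and len(batch) < max_batch:
        batch.append(queue.popleft())
    return batch

# 20 pending requests become batches of 8, 8, and 4.
queue = deque(range(20))
batches = []
while queue:
    batches.append(batch_requests(queue))
print([len(b) for b in batches])
```

In a real server the queue would be filled asynchronously and drained on a short timer, trading a few milliseconds of added wait for much higher GPU throughput.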
Manage the "GPU Fleet." This role keeps site operations stable and ensures the infrastructure scales up when traffic spikes and scales down to save money.
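The scale-up/scale-down decision can be reduced to targeting a fixed amount of queued work per replica and clamping the result to the fleet's limits. A minimal sketch of that policy; the thresholds are illustrative assumptions, not production values:

```python
# Decide how many GPU replicas to run from current queue depth.
# per_gpu, min_replicas, and max_replicas are illustrative assumptions.

def desired_replicas(queued: int, per_gpu: int = 10,
                     min_replicas: int = 1, max_replicas: int = 8) -> int:
    # Ceil-divide so a partially full queue still earns a replica,
    # then clamp between the floor (always-on) and the fleet ceiling.
    want = -(-queued // per_gpu)
    return max(min_replicas, min(max_replicas, want))

print(desired_replicas(0))    # idle traffic keeps the minimum warm
print(desired_replicas(45))   # mid-load scales proportionally
print(desired_replicas(500))  # spikes are clamped to the ceiling
```

In practice this logic is usually delegated to an autoscaler (e.g. a Kubernetes HPA driven by a queue-depth metric), but the policy it evaluates looks much like this function.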
To build a high-performance, responsive interface that masks the latency of VLM inference and provides a seamless "pro-grade" chat experience.
Ensure a "ChatGPT-like" smooth experience. This person bridges the gap between the heavy backend model and the user's browser, handling large image uploads and streaming text.
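Streaming is the standard trick for masking inference latency: the browser starts rendering the first token while the model is still generating. A minimal sketch of wrapping a token stream in the Server-Sent Events wire format; `fake_generate` is an assumed stand-in for the real model's token iterator:

```python
# Mask VLM latency by streaming tokens as Server-Sent Events (SSE)
# instead of waiting for the full reply. fake_generate is an assumption
# standing in for the real model's incremental decoder.

def fake_generate(prompt: str):
    # Placeholder: a real model would yield tokens as they are decoded.
    for token in ["Hello", " ", "world", "!"]:
        yield token

def sse_stream(prompt: str):
    # Each token is framed as "data: ...\n\n", the format the browser's
    # EventSource API parses natively.
    for token in fake_generate(prompt):
        yield f"data: {token}\n\n"
    yield "data: [DONE]\n\n"

chunks = list(sse_stream("hi"))
print("".join(chunks))
```

On the frontend, an `EventSource` (or a `fetch` reader) appends each `data:` payload to the chat bubble as it arrives, which is what produces the "ChatGPT-like" typing effect.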