This article is a solution to the Alchemyst-ai assignment.
We are given an AI system that we can run as a single system in a single process:
User request -> model -> response
But what problems can we run into?
- What if the system becomes overloaded?
- What if we want to scale it?
- What if we want to run a 100x larger model?
- What if different parts of the code are written in different languages?
So, instead of having everything in a single system, we will split responsibilities.
- For example, here:
  - Machine 1 for the API
  - Machine 2 for the Model Worker
But one question might come to mind: if we put things on different machines, how will they talk to each other?
- By network communication!
- By invoking functions remotely (Remote Procedure Call [RPC])
- Now, let's see what the flow would look like.
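To make the RPC idea concrete, here is a minimal sketch of one call travelling from the API side to the model worker side. It is an illustration, not the assignment's actual implementation: the names `RpcRequest`, `RpcResponse`, `callRemote`, `handleWire`, and the `generate` method are all hypothetical. The "network" is a plain in-memory function so the example stays runnable on one machine; a real transport would be asynchronous and go over HTTP, gRPC, or similar.

```typescript
// Hypothetical message shapes; real RPC frameworks define their own wire format.
interface RpcRequest {
  method: string;
  params: unknown[];
}

interface RpcResponse {
  result?: unknown;
  error?: string;
}

// --- Model worker side (would run on Machine 2) ---
const rpcHandlers = new Map<string, (...args: unknown[]) => unknown>();

// Stand-in for real model inference: it just echoes the prompt back.
rpcHandlers.set("generate", (prompt) => `echo: ${String(prompt)}`);

// Decode a request, run the matching handler, and encode the response.
function handleWire(wire: string): string {
  const req = JSON.parse(wire) as RpcRequest;
  const handler = rpcHandlers.get(req.method);
  const res: RpcResponse = handler
    ? { result: handler(...req.params) }
    : { error: `unknown method: ${req.method}` };
  return JSON.stringify(res);
}

// --- API side (would run on Machine 1) ---
// The transport carries an encoded request to the worker and returns the encoded reply.
// Assumption: synchronous and in-memory here; a real one would cross the network.
type RpcTransport = (wire: string) => string;

function callRemote(transport: RpcTransport, method: string, ...params: unknown[]): unknown {
  const req: RpcRequest = { method, params };
  const res = JSON.parse(transport(JSON.stringify(req))) as RpcResponse;
  if (res.error !== undefined) {
    throw new Error(res.error);
  }
  return res.result;
}

// User request -> API -> (RPC) -> model worker -> response
const reply = callRemote(handleWire, "generate", "hello");
console.log(reply); // prints "echo: hello"
```

The caller only sees `callRemote("generate", ...)`, which reads like a local function call even though the work happens in the worker's handler. That is the whole point of RPC: the serialization and transport are hidden behind a function-shaped interface.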
Now, think: do you expose internal machines publicly? Do you want anyone to be able to come and make any number of requests to our model?
- No, right? So what should we do? But before that, let's understand what problems we would have if we did so.
- Huge bills landing at your door.
- No authentication.
- The model may die.
- Solution: use gateways
- public internet -> gateway only -> private subnet workers
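The gateway idea above can be sketched as a single function that sits in front of the worker. This is a simplified illustration under stated assumptions: `createGateway`, `GatewayRequest`, `GatewayResponse`, and `ModelWorker` are hypothetical names, the API keys are hard-coded, and the rate limit is a plain per-key request quota with no time window. A production gateway would typically use a real auth system and a windowed or token-bucket limiter.

```typescript
// Hypothetical request/response shapes for the public-facing gateway.
interface GatewayRequest {
  apiKey?: string;
  prompt: string;
}

interface GatewayResponse {
  status: number;
  body: string;
}

// Stand-in for a call into the private subnet (e.g. an RPC to the model worker).
type ModelWorker = (prompt: string) => string;

function createGateway(
  validKeys: Set<string>,
  maxRequestsPerKey: number,
  worker: ModelWorker
): (req: GatewayRequest) => GatewayResponse {
  // Requests served so far, per API key.
  const used = new Map<string, number>();

  return (req) => {
    // Authentication: reject callers without a known key.
    if (!req.apiKey || !validKeys.has(req.apiKey)) {
      return { status: 401, body: "unauthorized" };
    }
    // Quota: protects the bill and keeps the worker from being flooded.
    const count = used.get(req.apiKey) ?? 0;
    if (count >= maxRequestsPerKey) {
      return { status: 429, body: "rate limit exceeded" };
    }
    used.set(req.apiKey, count + 1);
    // Only now does the request reach the private worker.
    return { status: 200, body: worker(req.prompt) };
  };
}

// Usage: the worker here is a trivial echo function.
const gateway = createGateway(new Set(["demo-key"]), 2, (prompt) => `echo: ${prompt}`);
console.log(gateway({ prompt: "hi" }).status); // 401
console.log(gateway({ apiKey: "demo-key", prompt: "hi" }).status); // 200
```

Because the worker is reachable only through the function returned by `createGateway`, every request passes the authentication and quota checks first, which mirrors the "gateway only" hop in the flow.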
Now, let's first explore the two workers.
caller-worker or api-worker (TypeScript)