Unfortunately, neural networks are fairly fixed in their layer sizes. The reason that arbitrarily growing a network mid-training wouldn't work is that, for anything beyond a single-layer linear network, changing the width of a layer (or the number of layers) alters the nonlinear interactions the intermediate layers have already learned, and convergence would likely become extremely difficult.
What I would suggest instead is finding a way to represent a potentially large number of states and actions within a fixed-size input and output. For example, imagine an input state that is a vector of size 3, representing RGB colors with one continuous variable per channel. Early in training the agent may only encounter a small number of "colors," but as it encounters new ones, there is a natural way to represent them without changing the architecture.
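To make that concrete, here is a minimal sketch of the idea. Everything in it is illustrative, not from your setup: the `encode_color` helper, the weight shapes, and the 4 outputs (standing in for action values) are all assumptions.

```python
import numpy as np

def encode_color(rgb_255):
    """Map an 8-bit RGB triple to a fixed-size continuous state vector in [0, 1]."""
    return np.asarray(rgb_255, dtype=np.float64) / 255.0

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(3, 8))  # input -> hidden; shapes never change
W2 = rng.normal(scale=0.1, size=(8, 4))  # hidden -> 4 outputs (e.g. action values)

def forward(state):
    """Tiny fixed-architecture network: 3 inputs, 4 outputs, regardless of
    how many distinct colors the agent has encountered so far."""
    hidden = np.tanh(state @ W1)
    return hidden @ W2

# Early in training the agent might only ever see two colors...
early_states = [encode_color(c) for c in [(255, 0, 0), (0, 255, 0)]]
# ...but a never-before-seen color needs no architectural change:
novel_state = encode_color((123, 45, 200))

for s in early_states + [novel_state]:
    assert forward(s).shape == (4,)
```

The point is that the network's input dimension stays 3 forever; what grows is only the set of points in that continuous space the agent has actually visited.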
I hope this makes sense and is helpful.