Which to use depends on the nature of the problem. One isn't necessarily "worse" or "better." In fact, one of Andrej's colleagues at OpenAI recently showed that the two methods have a strong theoretical connection, and in many cases compute the same thing. https://arxiv.org/abs/1704.06440
From a practical perspective, DQN is more likely to work in cases with discrete, well-separated state and action spaces, whereas PG works better with continuous state and action spaces. PG also propagates reward information faster because it uses full Monte Carlo backups, but it is less capable of exploring the state space, since it relies on on-policy updates. On the flip side, DQN uses one-step backups, but its off-policy nature allows for greater exploration (and experience replay).
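To make the backup difference concrete, here's a minimal sketch (with a made-up reward sequence and Q-value, purely for illustration): a PG-style full Monte Carlo return sums discounted rewards to the end of the episode, while a DQN-style one-step target bootstraps from the current Q estimate of the next state.

```python
# Hypothetical 3-step episode; rewards and the next-state Q value are
# invented numbers just to show the two update targets.
rewards = [0.0, 0.0, 1.0]
gamma = 0.99

def mc_returns(rewards, gamma):
    # PG style: full Monte Carlo return G_t for every timestep,
    # computed backwards from the end of the episode.
    G, out = 0.0, []
    for r in reversed(rewards):
        G = r + gamma * G
        out.append(G)
    return list(reversed(out))

def td_target(r, q_next_max, gamma):
    # DQN style: one-step backup that bootstraps from the current
    # Q estimate of the next state instead of the full return.
    return r + gamma * q_next_max

print(mc_returns(rewards, gamma))   # [0.9801, 0.99, 1.0]
print(td_target(0.0, 0.8, gamma))   # 0.792
```

The trade-off is visible here: the Monte Carlo return is unbiased but needs the whole episode (and the policy that generated it), while the one-step target can be computed from any single stored transition, which is what makes replaying off-policy data possible.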
I hope that gives some additional intuition.