While I understand your point, the outlook is worse than you think. ImageNet, like any other dataset, contains unintentional errors (mislabeled images). These errors likely affect about 1 to 5% of the dataset. Thus, 99.9% accuracy would mean the algorithm reproduced even the dataset's mistakes, see?
At some point, say 97.5%, there won't be any path for real improvement, because any further gain just means matching the dataset's own labeling errors. However, that doesn't mean people will stop trying, so it will never end.
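To make the arithmetic concrete, here is a minimal sketch. It assumes each image has exactly one label and that a fraction `label_error_rate` of those labels is wrong; the function names are made up for illustration. Under those assumptions, a model that always predicts the true class can score at most `1 - label_error_rate` against the given labels, and anything above that must come from agreeing with wrong labels.

```python
def meaningful_ceiling(label_error_rate: float) -> float:
    """Best measured accuracy a model can reach while only predicting true classes.

    Assumption: every mislabeled image counts as a miss for a model that
    predicts the true class, so the ceiling is 1 minus the label error rate.
    """
    return 1.0 - label_error_rate


def errors_reproduced(measured_accuracy: float, label_error_rate: float) -> float:
    """Minimum fraction of the dataset where the model must match a wrong label.

    At most (1 - label_error_rate) of the agreement can come from correctly
    labeled images; the remainder has to come from mislabeled ones.
    """
    return max(0.0, measured_accuracy - meaningful_ceiling(label_error_rate))


if __name__ == "__main__":
    # Hypothetical label error rates spanning the 1 to 5% range mentioned above.
    for rate in (0.01, 0.025, 0.05):
        ceiling = meaningful_ceiling(rate)
        copied = errors_reproduced(0.999, rate)
        print(
            f"label errors {rate:.1%}: ceiling {ceiling:.1%}, "
            f"99.9% accuracy needs at least {copied:.1%} of the dataset "
            f"to match a wrong label"
        )
```

With a 2.5% label error rate the ceiling lands at the 97.5% figure used above; the real rate for ImageNet is an estimate, so treat the numbers as illustrative.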
The more likely outcome is that one day people will just forget about it (rather than "solving" it).