DEPTH ANYTHING 3... IS COMING... #508

diverswan9 · 2025-10-19T12:57:37Z

TeeJay-NLD · 2025-10-19T15:43:43Z

... finally! Can't wait until it's implemented within IW3. Let's wait and see...

Thanks for sharing @diverswan9!

Cheers... TeeJay-NLD

0 replies

AIVFI · 2025-10-19T19:34:47Z

Amazing results! And the best part is that with Depth Anything 3 we can have an even better Video Depth Anything model!

Let's remember that Depth Anything 3 had been in development for a very long time and it was too late to benefit from some of the things that have come out recently.

The first thing to highlight is that Depth Anything 3 was trained on low resolution:

Training image resolutions are randomly sampled from 504 × 504, 504 × 378, 504 × 336, 504 × 280, 336 × 504.

The second thing is that it uses an old version of DINO:

Depth Anything 3 employs a single transformer (vanilla DINOv2 model) without any architectural modifications.

The newest DINOv3 has 3 things that I really missed in DINOv2:

1. Patch Size 16, instead of 14, see Table 2

We won't have to use resolutions divisible by 14 as with DA2 (training 518 × 518) or DA3 (training 504 × 504).

2. Stability of results even at very high resolutions. DINOv3 (ViT-H+) remains stable even at a resolution of 7168 × 4096, see Figure 17.

This opens the way for training and inference at, for example, 1280 × 720 or 3840 × 2160 resolution (divisible by 16 but not by 14).

3. Even better depth estimation results, confirmed on the most commonly used test data, see Table 12

Now try to imagine these results from the table above combined with the DA3 results from Table 3
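To make the patch-size point concrete, here is a minimal sketch that checks which of the resolutions mentioned above fit exactly on a patch grid of 14 (DINOv2) or 16 (DINOv3). The helper names `fits_patch_grid` and `patch_grid` are hypothetical and are not taken from any of the models' code. The sketch only tests exact divisibility and says nothing about how a given model handles sizes that do not fit.

```python
def fits_patch_grid(width: int, height: int, patch: int) -> bool:
    """Return True if both sides are exact multiples of the patch size."""
    return width % patch == 0 and height % patch == 0


def patch_grid(width: int, height: int, patch: int) -> tuple[int, int]:
    """Return (columns, rows) of patches; raise if the size does not fit."""
    if not fits_patch_grid(width, height, patch):
        raise ValueError(f"{width}x{height} is not divisible by {patch}")
    return width // patch, height // patch


# Resolutions mentioned in this post (hypothetical helper, illustrative only).
resolutions = [(518, 518), (504, 504), (1280, 720), (3840, 2160)]

for width, height in resolutions:
    for patch in (14, 16):
        if fits_patch_grid(width, height, patch):
            cols, rows = patch_grid(width, height, patch)
            print(f"{width}x{height} patch {patch}: {cols}x{rows} patches")
        else:
            print(f"{width}x{height} patch {patch}: does not fit")
```

With patch size 16, 1280 × 720 maps to an 80 × 45 grid and 3840 × 2160 to a 240 × 135 grid, while neither fits a patch size of 14. The opposite holds for the 518 × 518 and 504 × 504 training sizes.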

And that's not all...

Two people have already submitted requests for Video Depth Anything based on DINOv3:
DepthAnything/Video-Depth-Anything#77
DepthAnything/Video-Depth-Anything#84

I am also preparing to submit a request to the Video Depth Anything researchers in the coming days. My request will include many more proposals than just DINOv3. One of the many things I want to propose is the set of latest training datasets I have collected in my repository: Video Depth Estimation Rankings and 2D to 3D Video Conversion Rankings. And that's just the beginning of what I'll be offering the Video Depth Anything researchers. I hope that someone will support my request, which I will make in a few days, and add something of their own.

3 replies

nagadomi Oct 20, 2025
Maintainer

DINOv3 is not under an open-source license, and downloading it requires registration. I will not support any models that use it. In that sense, it is a good thing that it uses DINOv2.

gituser123456789000 Oct 20, 2025

The first thing to highlight is that Depth Anything 3 was trained on low resolution:

Training image resolutions are randomly sampled from 504 × 504, 504 × 378, 504 × 336, 504 × 280, 336 × 504.

We'll see the results once we get to test it, but it sounds unfortunate that it was trained at an even lower resolution rather than a higher one. If it's better and more accurate, great, but it seems that fine details and edges won't be on par with a higher-resolution model such as DepthPro. We'll see whether it scales well to higher resolutions, instead of losing accuracy and having the depth degrade into rust-like holes as v2 does when used above its native resolution.

AIVFI Oct 20, 2025

DINOv3 is not under an open-source license, and downloading it requires registration. I will not support any models that use it. In that sense, it is a good thing that it uses DINOv2.

Your own code can still be under the MIT licence, you just need to mention that you are using model A under licence B, which is based on backbone C under licence D. In addition, you attach copies of the licences and that's it. The DINOv3 licence is really friendly, please read it carefully. This licence is only to protect against certain specific uses of the code. Check out what DINOv3 can do and you will see how serious the consequences can be. Your project is not affected by this.

Now all the best models will be developed based on DINOv3. There is no going back to DINOv2. Less than 2 months ago MiDaS was shut down, and DINOv2 will soon face the same fate.

DINOv2 is not stable at high resolutions, as someone here has already tested, and this cannot be changed.

Let me say it again: there will be no more new models based on DINOv2. The ones that will still appear are those whose development had already started before DINOv3 was published.

Your project is truly amazing, please think it through and don't miss the opportunity that the technological leap involved in the new DINOv3 backbone will provide.

KolaKater · 2025-10-19T20:01:58Z

I hope Depth Anything 3 will address the common issue of missing sharp vertical edges in depth maps. All current models often fail to capture small, thin contours, which leads to distorted curved lines when the image is converted into 3D side-by-side format.
Here I am just using the example from @C-Three-P-O.

0 replies
