Text-Conditional JEPA for Learning Semantically Rich Visual Representations

Text-Conditional JEPA for Learning Semantically Rich Visual Representations

More to explore