San Jose, California, United States
TikTok Search Video Retrieval and Ranking (leading video indexing)
• Introduced an LLM into search indexing to measure video quality, improving content quality.
• Enhanced search indexing by introducing OCR and adding a new prediction head to the model, improving retrieval coverage by 20% in experimentation.
• Redesigned the offline indexing pipeline, primarily by migrating from MapReduce to PySpark, reducing pipeline runtime from 30 hours to 4 hours and the maximum CPU cores needed from 60k to 15k.