Joined 2 years ago
Cake day: July 18th, 2023

  • So I’m still on the fence about the AI arms race in general. However, reading up on DeepSeek it feels like they built a model specifically to work well on the benchmarks.

    I say this because it’s a Mixture of Experts approach, so only a subset of the model’s parameters is active for any given input. The potential drawback is weaker generalization.

    Additionally, it isn’t a multimodal model, and the only place I’ve seen real opportunity for workflow automation is with multimodal models. I guess you could use a combination of models, but that’s definitely a step back from the grand promise of these foundation models.

    Overall, I’m just not sure whether this is laypeople getting caught up in hype or actually a significant change in the landscape.
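
To make the Mixture of Experts point above a bit more concrete, here is a toy sketch of top-k routing in plain Python. Everything in it is an assumption for illustration: the names `route` and `moe_forward`, the scalar "experts", and the hard-coded gate scores are all made up. Real models learn the gate and apply it per token inside each layer, and this is not a claim about how DeepSeek implements it.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(gate_scores, k):
    # Pick the k experts with the highest gate scores and
    # renormalize their scores into mixing weights.
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    weights = softmax([gate_scores[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, gate_scores, k=2):
    # Only the chosen experts run; the rest are skipped entirely.
    chosen = route(gate_scores, k)
    output = sum(w * experts[i](x) for i, w in chosen)
    return output, [i for i, _ in chosen]

# Hypothetical toy experts: each one just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
# Hypothetical gate scores; in a real model a learned layer produces these.
gate_scores = [0.1, 2.0, -1.0, 1.5]

output, used = moe_forward(3.0, experts, gate_scores, k=2)
print(used, output)
```

With `k=2` only two of the four experts are evaluated for this input, which is the "only parts of the model are used" idea. The other two contribute nothing to the output.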