Leetao’s Space

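📖 Topic: FastVLM: Efficient Vision Encoding for Vision Language Models

🚩 Key points

• The FastViTHD encoder outputs fewer tokens, which significantly reduces encoding time
• The smallest variant is 85x faster than LLaVA-OneVision-0.5B, with a 3.4x smaller vision encoder
• The larger variants use the Qwen2-7B LLM and outperform Cambrian-1-8B with a 7.9x faster TTFT (time to first token)

✨ Conclusion: FastVLM is recommended for high-resolution image processing and is well suited to mobile applications; the repository provides multiple models and detailed training instructions.

🏷️ Tags: #MachineLearning #VisionLanguageModels

🔗 Link: https://github.com/apple/ml-fastvlm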

GitHub

This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025 - apple/ml-fastvlm

216 views, Glittering, 10:37

Leetao’s Space

The content of this book is really useful. Bookmark it if you might need it.

218 views, 06:26

Leetao’s Space

The content of this book is really useful. Bookmark it if you might need it.

WeChat Read (微信读书) has an e-book edition of this book.

218 views, 06:28

Leetao’s Space

https://github.com/RubyMetric/chsrc

GitHub

chsrc: a cross-platform, general-purpose source-switching tool and framework. Change Source everywhere for every software - RubyMetric/chsrc

203 views, 08:33

Leetao’s Space

https://www.reddit.com/r/Python/comments/1kmwdbu/microsoft_layoffs_hit_faster_cpython_team/

From the Python community on Reddit


212 views, 13:17

Leetao’s Space

https://x.com/karminski3/status/1922457081564611068

X (formerly Twitter)

karminski-牙医 (@karminski3) on X

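Check out this remarkable project: mergekit

This Python project can merge multiple large models into one. For example, if model A is particularly good at writing and model B is particularly good at coding, mergekit can combine the two into a single model. It can also transfer capabilities from one model to another.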
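
To make the idea of merging concrete, here is a minimal sketch of linear weight averaging, one of the simplest ways to combine two models that share the same architecture. It only illustrates the general idea and does not use mergekit's actual API or configuration format; the names `linear_merge`, `writer`, and `coder`, and the toy parameter values, are hypothetical.

```python
from typing import Dict, List

# A toy "model": parameter name -> flat list of weights (an assumption for illustration).
Weights = Dict[str, List[float]]


def linear_merge(model_a: Weights, model_b: Weights, alpha: float = 0.5) -> Weights:
    """Blend two models parameter by parameter: alpha * A + (1 - alpha) * B."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if model_a.keys() != model_b.keys():
        raise ValueError("models must share the same parameter names")

    merged: Weights = {}
    for name, a_vals in model_a.items():
        b_vals = model_b[name]
        if len(a_vals) != len(b_vals):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [alpha * a + (1 - alpha) * b for a, b in zip(a_vals, b_vals)]
    return merged


# Hypothetical stand-ins for "model A (good at writing)" and "model B (good at coding)".
writer = {"layer0.weight": [1.0, 2.0], "layer0.bias": [0.0]}
coder = {"layer0.weight": [3.0, 4.0], "layer0.bias": [1.0]}

merged = linear_merge(writer, coder, alpha=0.5)
print(merged)
```

Setting `alpha` closer to 1.0 keeps the result closer to the first model. Real merging tools work on full tensors and offer more elaborate strategies than a plain average.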

174 views, 12:18

Leetao’s Space

https://snarky.ca/unravelling-t-strings/

Tall, Snarky Canadian

Unravelling t-strings

PEP 750 introduced t-strings for Python 3.14. In fact, they are so new that as of Python 3.14.0b1 there still isn't any documentation yet for t-strings. 😅 As such, this blog post will hopefully help explain what exactly t-strings are and what you might use…

175 views, 04:58

Leetao’s Space

I haven't been on Zhihu in a long time. Has Zhihu now rolled out this "Circles" (圈子) feature?

187 views, 15:14

Leetao’s Space

https://blog.jacobstechtavern.com/p/the-side-hustle-from-hell

Jacobstechtavern

How I Got Exploited At My First Startup

11 months in The Side Hustle From Hell

168 views, 02:05

Leetao’s Space

Forwarded from Garyの梦呓

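Google I/O updates

- Gemini 2.5 Pro Deep Think
Adds a new enhanced reasoning mode that explores multiple hypotheses before responding, so it handles complex math and coding problems more effectively. New SOTA on 2025 USAMO and LiveCodeBench.
- Gemini Diffusion
A small-scale diffusion LLM that outperforms 2.0 Flash-Lite while running at over 2,000 tokens/s.
- Imagen 4, Veo 3, and Flow released
Imagen 4 can render fine details such as fabric, water droplets, and animal fur, and creates images at up to 2K resolution. Veo 3 improves quality and, for the first time, can generate video with audio. Flow integrates Veo, Imagen, and Gemini: users describe their shots in natural language and use Flow to weave the elements into polished scenes.

Other odds and ends:
- Gemini Advanced has been renamed AI Pro, and AI Ultra has launched; good enough as long as it is a better deal than "CloseAI"
- Gemini is built into Chrome
- Stitch can generate UI design mockups and export them to Figma
- NotebookLM supports video summaries, and AI Studio supports previewing interactive Gemini SDK apps

195 views, 02:53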
