Some questions about training the T.D #21

Open
lym0302 opened this issue Oct 23, 2024 · 2 comments

lym0302 commented Oct 23, 2024

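Hello, I'd like to ask a few questions about training the T.D. The paper says the T.D is trained on the AVSync15 dataset, but that dataset has no timestamp annotations, only class labels and videos (named like 6wHFhrAqt5Q_000023_000033_5.5_8.5.mp4). How can it be used to train a timestamp detector? In principle, the training data should include target timing labels (a per-frame audio label of 1 for sounding or 0 for silent). How are these timing labels obtained?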

My guess: for a video named 6wHFhrAqt5Q_000023_000033_5.5_8.5.mp4, the corresponding training sample is 6wHFhrAqt5Q_000023.mp4 from the VGGSound dataset, the span from 5.5 s to 8.5 s within that clip is labeled 1, and all remaining time is labeled 0. Is that right?
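To make the guess above concrete, here is a minimal Python sketch of that labeling scheme. It assumes the file name really encodes `<video id>_<VGGSound start>_<VGGSound end>_<segment start>_<segment end>`, which is only the hypothesis stated in this question and has not been confirmed by the maintainers. The helper names and the `labels_per_second` rate are hypothetical and chosen for illustration.

```python
from pathlib import Path


def parse_clip_name(filename):
    """Split an AVSync15-style file name under the ASSUMED layout
    <video_id>_<vgg_start>_<vgg_end>_<seg_start>_<seg_end>.mp4."""
    stem = Path(filename).stem  # drops only the final ".mp4"
    # Split from the right, because YouTube ids may themselves contain "_".
    video_id, vgg_start, vgg_end, seg_start, seg_end = stem.rsplit("_", 4)
    return {
        "video_id": video_id,
        "vgg_start": int(vgg_start),   # seconds into the source video
        "vgg_end": int(vgg_end),
        "seg_start": float(seg_start),  # seconds inside the VGGSound clip
        "seg_end": float(seg_end),
    }


def interval_to_frame_labels(seg_start, seg_end, duration, labels_per_second):
    """Return one 0/1 label per frame: 1 inside [seg_start, seg_end), else 0."""
    n_frames = round(duration * labels_per_second)
    return [
        1 if seg_start <= i / labels_per_second < seg_end else 0
        for i in range(n_frames)
    ]


info = parse_clip_name("6wHFhrAqt5Q_000023_000033_5.5_8.5.mp4")
clip_duration = info["vgg_end"] - info["vgg_start"]  # length of the VGGSound clip
labels = interval_to_frame_labels(
    info["seg_start"], info["seg_end"], clip_duration, labels_per_second=2
)
```

Splitting from the right keeps video ids that contain underscores intact. Whether a single coarse interval like this is a good enough target for a timestamp detector is exactly what the question asks.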

Any guidance would be greatly appreciated.

Basums commented Nov 15, 2024

But honestly, the 3 s label in 6wHFhrAqt5Q_000023_000033_5.5_8.5.mp4 is also quite coarse.

Basums commented Nov 15, 2024

Generally, you would use librosa.onset.
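As a rough illustration of that suggestion, the sketch below derives per-frame 0/1 labels from audio onsets. It is not the repository's confirmed pipeline. `librosa.load` and `librosa.onset.onset_detect` are real librosa functions, but the helper names, the `hold` duration, and the `labels_per_second` rate are assumptions made for this example. Onset detection marks where sound events begin, not how long they last, so the fixed `hold` window is a crude stand-in for event duration.

```python
import math


def detect_onset_times(audio_path):
    """Return onset times in seconds using librosa's default onset detector."""
    import librosa  # imported lazily so the helper below works without it

    y, sr = librosa.load(audio_path, sr=None)  # keep the native sample rate
    return librosa.onset.onset_detect(y=y, sr=sr, units="time").tolist()


def onsets_to_frame_labels(onset_times, duration, labels_per_second, hold):
    """Mark each frame 1 if it overlaps [onset, onset + hold), else 0.

    `hold` (seconds) is a hypothetical parameter: onsets carry no duration.
    """
    n_frames = round(duration * labels_per_second)
    labels = [0] * n_frames
    for t in onset_times:
        first = math.floor(t * labels_per_second)
        # Always cover at least the frame that contains the onset itself.
        last = max(first + 1, math.ceil((t + hold) * labels_per_second))
        for i in range(max(first, 0), min(last, n_frames)):
            labels[i] = 1
    return labels


# Example with hand-written onset times, so it runs without an audio file.
example_labels = onsets_to_frame_labels(
    [1.0, 3.0], duration=5.0, labels_per_second=2, hold=0.5
)
```

With a real clip, `detect_onset_times("clip.wav")` (a placeholder path) would supply the onset list. The resulting labels are finer-grained than a single 3 s interval, but they depend heavily on the detector settings and on the chosen `hold`.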
