转自 | AI开发者 编译 | JocelynWang
https://github.com/cgnorthcutt/cleanlab
https://l7.curtisnorthcutt.com/cleanlab-python-package
https://github.com/cgnorthcutt/confidentlearning-reproduce
-
多标签图像(蓝色):图像中有多个标签;
-
本体论问题(绿色):包括“是”(比如:将浴缸标记为桶)或 “有”(比如:示波器标记为CRT屏幕)两种关系,在这些情况下,数据集应该包含其中一类;
-
标签错误(红色):当数据集别的类的标签比给定的类标签更适合于某个示例时,就会显示标签错误
什么是置信学习?
-
描述噪声标签 -
寻找标签误差 -
采用噪声标签学习 -
寻找本体论问题
-
样本外预测概率(矩阵大小:#类的#样本数)。 -
噪声标签(矢量长度:样本数量)。
置信学习的好处
-
可直接估计噪声与真实标签的联合分布 -
适用于多类别的数据集 -
查找标签错误(错误按最有可能到最不可能的顺序排列) -
无需迭代(在ImageNet中查找训练集的标签错误需要3分钟) -
具有理论合理性(在真实条件下可以准确地找到标签错误和一致的联合分布估算) -
不需要做随机均匀的标签噪声的假设(在实践中通常不现实) -
只需要预测概率和噪声标签(可以使用任何模型) -
无需任何真实(保证无损)的标签 -
可以自然扩展到多标签数据集 -
可用于描述、查找和学习标签错误,CLEANLAB Python包是免费且开源的。
置信学习的原则
https://papers.nips.cc/paper/5073-learning-with-noisy-labels.pdf https://arxiv.org/abs/1505.07634 https://arxiv.org/abs/1609.03683
http://www.jmlr.org/papers/volume18/15-226/15-226.pdf https://dl.acm.org/citation.cfm?id=1403849 https://arxiv.org/abs/1802.03916
原论文链接:https://arxiv.org/abs/1911.00068
置信学习是如何实现的?
置信学习的实践应用
最后的想法
<pre style="max-width: 100%;letter-spacing: 0.544px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section style="margin-right: 8px;margin-left: 8px;max-width: 100%;white-space: normal;color: rgb(0, 0, 0);font-family: -apple-system-font, system-ui, "Helvetica Neue", "PingFang SC", "Hiragino Sans GB", "Microsoft YaHei UI", "Microsoft YaHei", Arial, sans-serif;letter-spacing: 0.544px;text-align: center;widows: 1;line-height: 1.75em;box-sizing: border-box !important;overflow-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;letter-spacing: 0.5px;font-size: 14px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><strong style="max-width: 100%;font-size: 16px;letter-spacing: 0.544px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;letter-spacing: 0.5px;box-sizing: border-box !important;overflow-wrap: break-word !important;">—</span></strong>完<strong style="max-width: 100%;font-size: 16px;letter-spacing: 0.544px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;letter-spacing: 0.5px;font-size: 14px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><strong style="max-width: 100%;font-size: 16px;letter-spacing: 0.544px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;letter-spacing: 0.5px;box-sizing: border-box !important;overflow-wrap: break-word !important;">—</span></strong></span></strong></span></strong></section><section style="max-width: 100%;white-space: normal;font-family: -apple-system-font, system-ui, "Helvetica Neue", "PingFang SC", "Hiragino Sans GB", "Microsoft YaHei UI", "Microsoft YaHei", Arial, sans-serif;letter-spacing: 0.544px;text-align: center;widows: 1;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section powered-by="xiumi.us" style="max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section style="margin-top: 15px;margin-bottom: 25px;max-width: 100%;opacity: 0.8;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section style="max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section style="max-width: 100%;letter-spacing: 0.544px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section powered-by="xiumi.us" style="max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section style="margin-top: 15px;margin-bottom: 25px;max-width: 100%;opacity: 0.8;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section style="max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section style="margin-right: 8px;margin-bottom: 15px;margin-left: 8px;padding-right: 0em;padding-left: 0em;max-width: 100%;color: rgb(127, 127, 127);font-size: 12px;font-family: sans-serif;line-height: 25.5938px;letter-spacing: 3px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(0, 0, 0);box-sizing: border-box !important;overflow-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;font-size: 16px;font-family: 微软雅黑;caret-color: red;box-sizing: border-box !important;overflow-wrap: break-word !important;">为您推荐</span></strong></span></section><p style="margin-right: 8px;margin-bottom: 5px;margin-left: 8px;padding-right: 0em;padding-left: 0em;max-width: 100%;min-height: 1em;color: rgb(127, 127, 127);font-size: 12px;font-family: sans-serif;line-height: 1.75em;letter-spacing: 0px;box-sizing: border-box !important;overflow-wrap: break-word !important;">“12306”的架构到底有多牛逼?</p><p style="margin-right: 8px;margin-bottom: 5px;margin-left: 8px;padding-right: 0em;padding-left: 0em;max-width: 100%;min-height: 1em;color: rgb(127, 127, 127);font-size: 12px;font-family: sans-serif;line-height: 1.75em;letter-spacing: 0px;box-sizing: border-box !important;overflow-wrap: break-word !important;">中国程序员34岁生日当天在美国遭抢笔记本电脑,追击歹徒被拖行后身亡,为什么会发生此类事件?</p><p style="margin-right: 8px;margin-bottom: 5px;margin-left: 8px;padding-right: 0em;padding-left: 0em;max-width: 100%;min-height: 1em;color: rgb(127, 127, 127);font-size: 12px;font-family: sans-serif;line-height: 1.75em;letter-spacing: 0px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;-webkit-tap-highlight-color: rgba(0, 0, 0, 0);cursor: pointer;font-size: 14px;box-sizing: border-box !important;overflow-wrap: break-word !important;">阿里如何抗住90秒100亿?看这篇你就明白了!</span></p><p style="margin-right: 8px;margin-bottom: 5px;margin-left: 8px;padding-right: 0em;padding-left: 0em;max-width: 100%;min-height: 1em;font-family: sans-serif;line-height: 1.75em;letter-spacing: 0px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;color: rgb(87, 107, 149);box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;font-size: 14px;box-sizing: border-box !important;overflow-wrap: break-word !important;">60个Chrome神器插件大收集:助你快速成为老司机,一键分析网站技术栈</span></span><br style="max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;" /></p><p style="margin-right: 8px;margin-bottom: 5px;margin-left: 8px;padding-right: 0em;padding-left: 0em;max-width: 100%;min-height: 1em;color: rgb(127, 127, 127);font-size: 12px;font-family: sans-serif;line-height: 1.75em;letter-spacing: 0px;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;-webkit-tap-highlight-color: rgba(0, 0, 0, 0);cursor: pointer;font-size: 14px;box-sizing: border-box !important;overflow-wrap: break-word !important;">深度学习必懂的13种概率分布</span></p></section></section></section></section></section></section></section></section>
本篇文章来源于: 深度学习这件小事
本文为原创文章,版权归知行编程网所有,欢迎分享本文,转载请保留出处!
内容反馈