
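The Complete Guide to Handling Missing Data in Machine Learning: 5 Imputation Methods Explained

"Does your model always seem to have amnesia? Fine during training, then falls apart the moment it has to predict? Odds are missing data is causing trouble!" Today let's talk about the problem that drives countless data people up the wall: what do you do about missing data? Here are five emergency moves, walked through step by step, to fill in the holes.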


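🧹 Move 1: Simple, brute-force deletion (use with caution!)

"Delete, delete, delete, and you're done?" Don't nod just yet! This move is like cutting a pair of ripped trousers into shorts: it does fix the problem quickly, but the price may be higher than you think...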

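Where it fits:

  • Missing values make up less than 5% of the data (for example, only a few rows missing out of hundreds)
  • The dataset is large enough that you can afford to throw some away (for example, tens of millions of records)
  • The values are missing completely at random (what the cited web page calls the MCAR type)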
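
Before deleting anything, it helps to measure how much is actually missing. The following is a minimal sketch, assuming the data lives in a pandas DataFrame; the column names and values are made up for illustration and are not from the original article.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 40 rows, one gap in "income", many gaps in "score"
df = pd.DataFrame({
    "age": np.arange(40),
    "income": [3000.0] * 39 + [np.nan],
    "score": [1.0, np.nan] * 20,
})

# Share of missing values per column: isna() gives booleans, mean() averages them
missing_ratio = df.isna().mean()

# Columns whose missing share is below the 5% rule of thumb
low_missing_cols = missing_ratio[missing_ratio < 0.05].index.tolist()

print(missing_ratio)
print(low_missing_cols)
```

Here "income" is missing in 1 of 40 rows (2.5%), so dropping those rows is cheap, while "score" is half empty and needs a different treatment.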

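How to do it:

    # df is assumed to be an existing pandas DataFrame.
    # dropna returns a new DataFrame, so keep the result.
    # Drop every row that contains a missing value
    df_rows_dropped = df.dropna(axis=0)
    # Drop every column that contains a missing value
    df_cols_dropped = df.dropna(axis=1)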

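An example:

Patient ID   Temperature   Blood pressure   Heart rate
001          36.5          120              80
002          NaN           115              75
003          37.1          NaN              82

👉 Drop rows and only patient 001 is left; drop columns and only the patient ID and heart rate are left... what is there left to analyze?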
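
The example above can be reproduced in a few lines. This is a sketch, assuming pandas; the English column names are my own rendering of the table headers, and the `subset` variant at the end is an added suggestion rather than something the article prescribes.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "patient_id": ["001", "002", "003"],
    "temperature": [36.5, np.nan, 37.1],
    "blood_pressure": [120, 115, np.nan],
    "heart_rate": [80, 75, 82],
})

rows_kept = df.dropna(axis=0)   # removes patients 002 and 003
cols_kept = df.dropna(axis=1)   # removes temperature and blood_pressure

# A gentler option: only drop rows that lack a column you truly need
needs_temperature = df.dropna(subset=["temperature"])

print(rows_kept["patient_id"].tolist())          # ['001']
print(cols_kept.columns.tolist())                # ['patient_id', 'heart_rate']
print(needs_temperature["patient_id"].tolist())  # ['001', '003']
```

Restricting the drop to the columns the analysis actually depends on usually throws away far less data than a blanket `dropna`.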


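📊 Move 2: The three statistical musketeers (mean / median / mode)

"It's just filling in blanks, why make it so complicated?" This move is like covering a hole with a patch: not pretty, but quick and convenient!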

Which one should you pick?

Data type             Recommended method   Example
Normal distribution   Mean                 Average height of a class
Skewed distribution   Median               Regional income levels
Categorical data      Mode                 Users' favorite phone color
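
The table maps directly onto pandas `fillna`. A minimal sketch, assuming a pandas DataFrame; the columns and values are invented to mirror the three examples in the table.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "height_cm": [160.0, 170.0, np.nan, 180.0],        # roughly symmetric
    "income": [3000.0, 3500.0, np.nan, 50000.0],       # skewed by one large value
    "phone_color": ["black", "black", None, "white"],  # categorical
})

filled = df.copy()
filled["height_cm"] = df["height_cm"].fillna(df["height_cm"].mean())
filled["income"] = df["income"].fillna(df["income"].median())
# mode() returns a Series because ties are possible, so take the first entry
filled["phone_color"] = df["phone_color"].fillna(df["phone_color"].mode()[0])

print(filled)
```

Note how the median keeps the income fill at 3500 instead of letting the single 50000 outlier drag it upward, which is the reason the table recommends it for skewed data.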

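Hidden pitfalls:

  • Filling run after run of gaps with the mean flattens out the fluctuations (fatal for stock data)
  • The mode can throw the classes out of balance (minority groups get ignored)
  • It breaks relationships between variables (such as the time-series trends mentioned on the cited web page)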
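
The first pitfall is easy to see numerically. The sketch below uses synthetic data (an assumption for illustration, not real prices): it hides about 30% of the values at random, fills them with the mean, and compares the spread before and after.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
prices = pd.Series(rng.normal(loc=100.0, scale=10.0, size=1000))

# Hide roughly 30% of the values at random
with_gaps = prices.copy()
with_gaps[rng.random(1000) < 0.3] = np.nan

# Every gap receives the same constant, which adds no variation at all
mean_filled = with_gaps.fillna(with_gaps.mean())

print(f"std of observed values: {with_gaps.std():.2f}")
print(f"std after mean filling: {mean_filled.std():.2f}")
```

The standard deviation after filling is always smaller, because the filled points sit exactly on the mean. Any model that relies on volatility will see a calmer series than the real one.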

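🤖 Move 3: The machine learning approach (KNN / random forest)

"Let AI be your hole-filling expert!" This move is like hiring a professional tailor to mend your clothes: fine stitching and a good fit.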

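Head-to-head comparison:

Method           Pros                                      Cons                   Best for
KNN              Simple and easy to understand             Computation explodes   Small datasets / few features
Random forest    Handles multicollinearity automatically   Prone to overfitting   High-dimensional data / nonlinear relationships
Neural network   Captures complex relationships            Needs lots of data     Image / text data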

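Code snippet:

    import pandas as pd
    from sklearn.impute import KNNImputer

    imputer = KNNImputer(n_neighbors=3)
    # fit_transform returns a NumPy array; df is assumed to be all-numeric
    df_filled = pd.DataFrame(imputer.fit_transform(df), columns=df.columns, index=df.index)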
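
To make the idea behind KNN imputation concrete, here is a deliberately simplified NumPy sketch. It is not scikit-learn's exact algorithm: the function name `knn_impute` and the distance (mean squared difference over the features both rows have) are assumptions chosen for readability.

```python
import numpy as np

def knn_impute(X, k=2):
    """Fill each NaN with the mean of that column over the k nearest rows."""
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    for i, j in zip(*np.where(np.isnan(X))):
        candidates = []
        for r in range(X.shape[0]):
            if r == i or np.isnan(X[r, j]):
                continue  # a neighbor must have column j observed
            shared = ~np.isnan(X[i]) & ~np.isnan(X[r])
            if not shared.any():
                continue  # no common features to compare on
            # Mean squared difference over the shared features
            dist = np.mean((X[i, shared] - X[r, shared]) ** 2)
            candidates.append((dist, X[r, j]))
        if candidates:  # otherwise the cell stays NaN
            candidates.sort(key=lambda pair: pair[0])
            filled[i, j] = np.mean([value for _, value in candidates[:k]])
    return filled

X = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, np.nan],
    [9.0, 90.0],
]
result = knn_impute(X, k=2)
print(result)
```

The row `[3.0, nan]` is closest to the rows starting with 2.0 and 1.0, so its gap becomes the average of 20 and 10, that is 15. The distant row starting with 9.0 has no influence. The nested loops also show why the comparison table warns about computation cost on large datasets.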

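👉 The generative adversarial networks (GANs) mentioned on web page 4 are fancy, but for beginners they are like asking a primary-school pupil to fly a rocket: it is not going to work!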


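🎲 Move 4: The dark art of multiple imputation (a must for advanced players)

"If once isn't enough, fill it in a few more times!" This move is like mending over and over with threads of different colors until the pattern comes out as close as possible to the original cloth.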

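A three-step strategy:

  1. Generate 5 to 10 imputed versions of the data
  2. Build a model on each version separately
  3. Combine the results from all versions into the final answer (in standard multiple imputation this means pooling, for example averaging the estimates, rather than cherry-picking a single version)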
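
The three steps can be sketched with NumPy alone. This is a simplified illustration on synthetic data, under stated assumptions: each imputed version is a regression prediction plus random noise, the "model" in step 2 is just the mean of `y`, and pooling is a plain average. A full multiple-imputation procedure would also account for uncertainty in the regression coefficients.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data (an assumption for illustration): y depends linearly on x
n = 200
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(scale=0.5, size=n)
missing = rng.random(n) < 0.3            # hide about 30% of y
y_obs = np.where(missing, np.nan, y)

# Regression of y on x, fitted on the complete rows only
slope, intercept = np.polyfit(x[~missing], y_obs[~missing], deg=1)
residuals = y_obs[~missing] - (slope * x[~missing] + intercept)
residual_sd = np.std(residuals, ddof=2)

m = 5
estimates = []
for _ in range(m):
    # Step 1: one imputed version = prediction + random noise
    y_imp = y_obs.copy()
    noise = rng.normal(scale=residual_sd, size=missing.sum())
    y_imp[missing] = slope * x[missing] + intercept + noise
    # Step 2: run the analysis on this version (here: the mean of y)
    estimates.append(y_imp.mean())

# Step 3: pool by averaging the per-version estimates
pooled_mean = float(np.mean(estimates))
between_var = float(np.var(estimates, ddof=1))  # spread across versions
print(pooled_mean, between_var)
```

The spread between versions (`between_var`) is the extra information a single fill cannot give you: it indicates how much the conclusion depends on the values that had to be guessed.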

A real-world case:
A bank used this method to handle missing customer income data, and the accuracy of its bad-debt predictions improved by ?8%!


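⏳ Move 5: Time-series specials (linear / spline interpolation)

"Can yesterday's data predict today?" This move is like using puzzle pieces from the past and the future to fill in the gap in the present.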

Method comparison:

Interpolation type   Best for                                   Formula
Linear               Gradual change                             y = ax + b
Quadratic            Accelerating change                        y = ax² + bx + c
Spline               Complex fluctuations (e.g. stock prices)   Piecewise polynomial functions
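
The first two rows of the table can be tried out directly. A minimal sketch, assuming pandas and NumPy; the series values and dates are made up. The linear case uses `Series.interpolate`, and the quadratic case fits y = ax² + bx + c to the known points with `np.polyfit`.

```python
import numpy as np
import pandas as pd

# Linear: gaps fall on the straight line between the nearest known points
s = pd.Series(
    [10.0, np.nan, np.nan, 16.0, 18.0],
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
)
linear = s.interpolate(method="linear")
print(linear.tolist())  # [10.0, 12.0, 14.0, 16.0, 18.0]

# Quadratic: fit y = a*x**2 + b*x + c to the known points, then evaluate the gap
x_known = np.array([0.0, 1.0, 3.0, 4.0])
y_known = x_known ** 2          # accelerating change; the value at x = 2 is missing
a, b, c = np.polyfit(x_known, y_known, deg=2)
y_at_2 = a * 2.0 ** 2 + b * 2.0 + c
print(round(y_at_2, 6))
```

pandas also accepts `method="quadratic"` and `method="spline"` (the latter with an `order` argument), but those options rely on SciPy being installed.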

👉 The ARIMA model mentioned on web page 5 is accurate, but configuring its parameters is more work than assembling a LEGO set!


💡 Exclusive insights (a few blunt, off-the-cuff truths)

Advice from a veteran with years of data cleaning behind them:

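  1. Business understanding > algorithm choice: knowing why the data is missing matters more than how you fill it (for example, as web page 6 points out, an "Unknown" value may be hiding key information)
  2. Mixing methods is the way to go: first use deletion to clear out obviously junk data, then use machine learning to impute the core features
  3. Keep a validation set: set some data aside specifically to check how well the imputation works (for example, hide values you actually know and compare the filled-in values against them)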
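
The third point can be put into practice by hiding values whose truth is known and scoring each fill method against them. A minimal sketch on a synthetic, smoothly varying series (an assumption for illustration); `rmse` is a small helper defined here, not a library function.

```python
import numpy as np
import pandas as pd

# Synthetic series with a smooth trend
t = np.arange(100)
truth = pd.Series(np.sin(t / 10.0) * 5.0 + 20.0)

# Hide 20 known values on purpose; these act as the validation set.
# The endpoints are excluded so interpolation always has two neighbors.
rng = np.random.default_rng(1)
hidden = rng.choice(np.arange(1, 99), size=20, replace=False)
masked = truth.copy()
masked.iloc[hidden] = np.nan

def rmse(filled):
    """Root mean squared error on the deliberately hidden positions."""
    diff = filled.iloc[hidden] - truth.iloc[hidden]
    return float(np.sqrt((diff ** 2).mean()))

rmse_mean = rmse(masked.fillna(masked.mean()))
rmse_linear = rmse(masked.interpolate(method="linear"))

print(f"mean fill RMSE:            {rmse_mean:.3f}")
print(f"linear interpolation RMSE: {rmse_linear:.3f}")
```

On a trending series like this one, interpolation scores far better than the mean. On your own data the ranking may differ, which is exactly why the check is worth running.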

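One last thought for everyone: data is like a romantic partner. Missing values are not a flaw but the best chance to get to know them! Next time you run into missing values, don't rush to delete or fill. Sit down first, have a coffee with your data, and talk about life~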

This article is an exclusive original by 嘻道妙招. Reproduction without permission is strictly prohibited.