
Medical Knowledge Graph Construction

Time: 2019-06-28 05:57:48  Source:  Author:

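Abstract: Medical knowledge graphs are a cornerstone of smart healthcare and are expected to enable more efficient and precise medical services. However, existing knowledge graph construction techniques generally suffer from low efficiency, many restrictions, and poor extensibility in the medical domain. Because medical data are cross-lingual, highly specialized, and structurally complex, this paper analyzes the key techniques for building medical knowledge graphs from the bottom up, covering five parts: medical knowledge representation, extraction, fusion, reasoning, and quality assessment. It also reviews how medical knowledge graphs are currently applied in medical services such as information retrieval, knowledge question answering, and intelligent diagnosis. Finally, in view of the major challenges and key problems that medical knowledge graph construction faces, it discusses the outlook for the field.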

 

Keywords: knowledge graph; knowledge extraction; knowledge fusion; knowledge reasoning; natural language processing

 

 

 

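In 1998, Tim Berners-Lee, the father of the World Wide Web, put forward the idea of the Semantic Web, which gradually took shape as a standards-based effort. At the same time, the volume of Linked Open Data has surged, and more and more knowledge and metadata are scattered across the Web.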

 

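After several rounds of evolution, the knowledge graph has emerged as a new way of representing and managing knowledge. Driven by the rapid development of artificial intelligence, the key techniques involved in knowledge graphs (knowledge extraction, representation, fusion, reasoning, and question answering) have been solved or advanced to a certain degree. Knowledge graphs have become a new hot spot in knowledge services and have attracted wide attention from academia and industry.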

 

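The knowledge graph grew out of earlier forms of knowledge organization and expression, and it lets knowledge flow smoothly between computers and between people and computers. From the graph perspective, a knowledge graph consists of a schema graph, a data graph, and the relation between the two. The schema graph describes the conceptual level of human knowledge and emphasizes the formal expression of concepts and their relations: its nodes are concept entities and its edges are semantic relations between concepts, such as part-of. The data graph describes the physical world and emphasizes a set of objective facts. Its nodes fall into two classes: concept entities from the schema graph, and descriptive strings. Its edges are semantic descriptions of concrete facts. The relation between the schema graph and the data graph is the correspondence between instances in the data graph and concepts in the schema graph. In general, the schema graph acts as the mold for the data graph.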

 

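General-purpose knowledge graphs include Google's "Knowledge Graph", Sogou's "Knowledge Cube", YAGO, and DBpedia, all of which are characterized by their large scale. Medicine is currently one of the vertical domains in which knowledge graphs are most widely applied. Applications such as the traditional Chinese medicine knowledge graph of a Shanghai hospital, the ontology-based medical knowledge base SNOMED-CT, and IBM Watson Health have begun to stand out in recent years.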

 

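Knowledge graphs and big data are current research hot spots. Several of their inherent features match the development of the information age: incremental design of the data schema, good data integration capability, support from standards such as RDF and OWL, and the ability to acquire and reason over knowledge. The development of medical informatics and medical information systems has accumulated a vast amount of medical data. How to distill information from these data and then manage, share, and apply it is a key problem in making medicine intelligent. It is also the foundation for intelligent medical knowledge retrieval, clinical diagnosis, and the processing of electronic medical records.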
 

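This paper first introduces the construction of medical knowledge graphs, which mainly involves medical knowledge representation, medical knowledge extraction (of entities, relations, and attributes), medical knowledge fusion, medical knowledge reasoning, and quality assessment. It then reviews the current state of applications built on medical knowledge graphs, such as search, question answering, and decision support. Finally, based on the research and application characteristics of medical knowledge graphs, it discusses the challenges they face and their future development.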

 

1 Medical knowledge graph construction

 

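This paper divides medical knowledge graph construction into five parts: representation, extraction, fusion, and reasoning of medical knowledge, plus quality assessment. Entities, relations, attributes, and the other building blocks of a knowledge graph are extracted from large amounts of structured and unstructured medical data and stored in the knowledge base in a reasonable and efficient form. Medical knowledge fusion disambiguates and links the content of the medical knowledge base, which strengthens its internal logic and expressive power, and it updates old knowledge or adds new knowledge to the graph either manually or automatically. Knowledge reasoning derives missing facts and supports automated diagnosis and treatment. Quality assessment is an important means of safeguarding the data and improves the credibility and accuracy of the medical knowledge graph.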

 

1.1 Medical knowledge representation

 

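Knowledge representation is a set of conventions for describing the world. It is the process of turning knowledge into symbols, formal expressions, and patterns, and it mainly studies how computers store knowledge. The chosen representation affects how efficiently a system can acquire, store, and use knowledge. Medical data, however, come in many kinds, are stored in inconsistent ways, follow different electronic medical record formats and standards, and often span several disciplines. As a result, knowledge representation in medicine differs from that in general domains, and it poses considerable challenges.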

 

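Knowledge representation methods commonly used in early medical knowledge bases include predicate logic, production rules, frames, and semantic networks. Examples are SNOMED-CT, the early MYCIN system, and the Escherichia coli database EcoCyc. As the knowledge in a knowledge graph grows and its relations become more complex, these methods fall short in expressive power. They are no longer the main representation and in most cases serve only as an aid or supplement to medical knowledge representation.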

 

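Ontology-based representation expresses knowledge as a network and uses triples of the form (entity 1, relation, entity 2) to describe two connected nodes (entities). It gradually gained acceptance after knowledge graphs were proposed. It resembles semantic network representation, with one difference: an ontology focuses on the intrinsic features of entities and is therefore more focused and deeper, which also gives it more room to develop. Many ontology description languages exist, chiefly RDF and RDF-S, DAML, and OWL. Representing medical terminology with ontologies makes it possible to build powerful, interoperable medical information systems, meets the need to reuse and share medical data, and supports statistical aggregation under different semantic standards. Building a medical ontology requires a deep analysis of the structure and concepts of medical terminology, so that obscure or even cross-lingual medical knowledge can be expressed effectively. Existing medical knowledge bases of this kind include the medical concept knowledge base LinkBase and the TAMBIS ontology (TaO).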
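The (entity 1, relation, entity 2) form described above can be sketched as a very small in-memory triple store. This is only an illustration under stated assumptions: the class `TripleStore`, its `match` method, and the relation names `has-symptom` and `contraindicated-for` are hypothetical and do not come from RDF, OWL, or any library named in the text.

```python
from typing import Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (entity 1, relation, entity 2)


class TripleStore:
    """A tiny in-memory set of triples; not an RDF library."""

    def __init__(self, triples: Iterable[Triple] = ()) -> None:
        self._triples: Set[Triple] = set(triples)

    def add(self, head: str, relation: str, tail: str) -> None:
        self._triples.add((head, relation, tail))

    def match(self, head: Optional[str] = None, relation: Optional[str] = None,
              tail: Optional[str] = None) -> List[Triple]:
        """Return triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (head is None or t[0] == head)
            and (relation is None or t[1] == relation)
            and (tail is None or t[2] == tail)
        )


# Illustrative facts only; the relation names are made up for this sketch.
store = TripleStore()
store.add("chronic gastritis", "is-a", "gastritis")
store.add("gastritis", "has-symptom", "stomach pain")
store.add("amoxicillin", "contraindicated-for", "penicillin allergy")

is_a = store.match(relation="is-a")
about_gastritis = store.match(head="gastritis")
```

A real system would more likely serialize such triples in RDF and query them with a dedicated store; the sketch only shows why a triple is a natural unit for both storage and pattern queries.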

 
    
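The number of nodes in a knowledge graph affects the structural complexity of the network and the efficiency and difficulty of reasoning over it. Knowledge representation learning uses machine learning to represent the semantic information of the objects under study as dense low-dimensional vectors, which effectively addresses data sparsity and thereby improves knowledge fusion and reasoning. A low-dimensional vector is a distributed representation: it imitates the way the brain stores an object across many neurons and uses the vector to carry the object's semantic information. Representative models in knowledge representation learning include Structure Embedding (SE), the single layer model (SLM), the latent factor model (LFM), and translation models based on TransE. These models take the interplay between entities and the computational cost into account. They represent entities as vectors, apply a transformation that corresponds to the relation, and score how strongly the entities are related, which provides an important reference for reasoning and knowledge completion. Kleyko et al. showed that classifying medical images with a distributed representation can reach the same accuracy as the best existing methods. Henriksson et al. compared several knowledge representation methods for four kinds of EHR records, including diagnosis records, medication records, and clinical notes. Knowledge representation learning thus opens up a new line of thought for representing knowledge in medical knowledge graphs.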
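As a hedged illustration of the translation models mentioned above, the sketch below computes the TransE-style distance ||h + r - t|| and ranks candidate tail entities by it. The two-dimensional vectors are hand-picked for readability rather than learned, and the function names `transe_score` and `rank_tails` are assumptions of this example.

```python
import math
from typing import Dict, List, Sequence


def transe_score(head: Sequence[float], relation: Sequence[float],
                 tail: Sequence[float]) -> float:
    """L2 distance ||h + r - t||; a smaller value means a more plausible triple."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def rank_tails(head: Sequence[float], relation: Sequence[float],
               candidates: Dict[str, Sequence[float]]) -> List[str]:
    """Order candidate tail entities from most to least plausible."""
    return sorted(candidates, key=lambda name: transe_score(head, relation, candidates[name]))


# Toy embeddings chosen by hand (NOT trained) so the arithmetic is easy to follow.
aspirin = [1.0, 0.0]
treats = [0.0, 1.0]
candidates = {
    "headache": [1.0, 1.0],   # aspirin + treats lands exactly here
    "fracture": [3.0, -1.0],  # far from aspirin + treats
}

best_first = rank_tails(aspirin, treats, candidates)
headache_distance = transe_score(aspirin, treats, candidates["headache"])
```

In practice the vectors are learned by minimizing this distance for observed triples and pushing it up for corrupted ones; that training loop is omitted here.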

 

1.2 Medical knowledge extraction

 

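Building a medical knowledge graph is mainly a matter of extracting entities, relations, and attributes from unstructured data, either manually or automatically. In manual extraction, experts collect and organize information according to fixed rules. Medical knowledge bases built manually so far include clinical medical knowledge bases, SNOMED-CT, and ICD-10. Automatic extraction uses information extraction techniques such as machine learning, artificial intelligence, and data mining to pull the basic building blocks of a knowledge graph out of data sources. A typical example of an automatically built medical knowledge base is the Unified Medical Language System (UMLS). Because manual extraction is too costly, automatic knowledge extraction is the current research focus and the direction in which knowledge graph construction is heading. This section describes how to extract knowledge automatically from data sources, covering entity, relation, and attribute extraction.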

 

1.2.1 Entity extraction

 

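The purpose of recognizing medical entities in text is to identify key concepts so that relations and other information can be extracted next, and to express the recognized concepts in a standardized form. Medical entity extraction pulls entities of specific types out of medical data sources. This section groups the methods into three classes: methods based on medical dictionaries and rules, statistical and machine learning methods based on medical data sources, and deep learning methods.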

 

1) Methods based on medical dictionaries and rules

 

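These methods extract medical entities from a corpus by matching manually defined rules and patterns, or by looking terms up in generated or existing medical dictionaries. The approach is challenging. First, no dictionary currently covers every type of named entity, so a simple text matching algorithm is not enough for entity recognition. Second, the same word or phrase can refer to different things depending on its context. Third, many biological and drug entities have several names at once (for example, PTEN and MMAC refer to the same gene). For these reasons, dictionary- and rule-based methods were widely used only in the early period. Friedman et al. recognized medical information in electronic medical records with custom semantic patterns and a grammar, and Wu et al. obtained good experimental results using the CHV and SNOMED-CT medical vocabularies. Although this approach can reach high precision, it cannot fully solve the problems above, and it depends heavily on dictionaries and rules written by experts, so it cannot keep up with the new entities that constantly appear in medical vocabulary.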
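The dictionary lookup described above can be sketched as a longest-match scan that also maps aliases to one canonical name. Everything in this example is illustrative: the mini-dictionary, the sample sentence, and the function name `extract_entities` are made up here and are not the method of Friedman et al. or Wu et al.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical mini-dictionary: lowercase surface form -> (canonical name, entity type).
# "mmac" is listed as an alias of PTEN, spelled as in the text above.
DICTIONARY: Dict[str, Tuple[str, str]] = {
    "pten": ("PTEN", "gene"),
    "mmac": ("PTEN", "gene"),
    "chronic gastritis": ("chronic gastritis", "disease"),
    "gastritis": ("gastritis", "disease"),
}


def extract_entities(text: str, dictionary: Dict[str, Tuple[str, str]] = DICTIONARY,
                     max_words: int = 3) -> List[Tuple[str, str, str]]:
    """Greedy longest-match lookup; returns (mention, canonical name, type) tuples."""
    tokens = [(m.group().lower(), m.start(), m.end()) for m in re.finditer(r"\w+", text)]
    found = []
    i = 0
    while i < len(tokens):
        # Try the longest phrase first so "chronic gastritis" wins over "gastritis".
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            phrase = " ".join(tok for tok, _, _ in tokens[i:i + n])
            if phrase in dictionary:
                canonical, entity_type = dictionary[phrase]
                mention = text[tokens[i][1]:tokens[i + n - 1][2]]
                found.append((mention, canonical, entity_type))
                i += n
                break
        else:
            i += 1  # no dictionary entry starts at this token
    return found


# A made-up sentence used only to exercise the matcher.
entities = extract_entities("MMAC loss was seen in chronic gastritis.")
```

The sketch also shows the limits named in the text: a term missing from the dictionary is never found, and a term with several senses is always given the single sense stored in the dictionary.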

 

2) Statistical and machine learning methods based on medical data sources

 

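These methods train statistical and machine learning models on features of medical data sources in order to recognize entities. For English medical entity extraction, the most representative annotated corpus is the English electronic medical record corpus released for I2B2 2010. In addition, evaluations such as SemEval and NTCIR, as well as the NCBI corpus, provide annotated data for English medical entities.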

 

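Models currently used in this line of work include the hidden Markov model (HMM), the conditional random field (CRF), and the support vector machine (SVM). Kazama et al. used an SVM for biomedical entity recognition and introduced features such as POS tags, a word cache, and HMM states obtained by unsupervised training. The method achieved good accuracy on the GENIA corpus and, in particular, can be applied fairly efficiently to large corpora. Zhou et al. trained an HMM with a series of features, including word formation, morphology, POS tags, semantic triggers, and name aliases. On the GENIA corpus the recognition precision reached 66.5% and the recall reached 66.6%. Chen and Friedman used the MEDLEE system to recognize information in biomedical text. The system uses natural language processing to recognize phrases that occur in journal abstracts. Biomedical entity recognition often relies on fairly small knowledge bases. Chen and Friedman automatically imported several thousand UMLS concepts related to cell function and cell anatomy, together with several hundred concepts from a mammalian ontology, and manually added several hundred more. Their entity recognition reached a precision of 64.0% and a recall of 77.1%. Although these figures are not high, the work gave later researchers a workable approach.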

 

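The datasets used for medical entity recognition depend heavily on manual annotation, which demands a high level of expertise. Researchers have begun to study how to reduce this dependence on annotated data. The main questions are how to keep improving model performance with large amounts of unlabeled data, and how to explore methods such as active learning and transfer learning when learning from small samples, so that learning becomes a continuous process.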

 

3) Deep learning methods

 

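Deep learning has come to be widely applied to named entity recognition. Most representatively, Collobert proposed a deep neural network model in 2011 whose results and performance surpassed traditional algorithms. Sahu et al. combined CNN and RNN with word embeddings, which is among the better algorithms at present, although it requires a training process. In the medical domain, We et al. combined a CRF with a bidirectional RNN and used an SVM for entity recognition. The mainstream deep learning model for medical entity recognition is currently BiLSTM-CRF. Jagannatha et al. compared the CRF, BiLSTM, and BiLSTM-CRF models, along with some improved variants, on entity recognition in English electronic medical records. Their experiments showed that every LSTM-based model outperformed the CRF, and that combining BiLSTM with a CRF raised accuracy by a further 2%-5%.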

 

1.2.2 Entity relation extraction

 

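This paper divides medical entity relation extraction into two classes: a) extraction of hierarchical relations between medical entities of the same type, such as the disease pair "gastritis - chronic gastritis"; b) extraction of relations between entities of different types, such as "disease - symptom".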

 

1) Hierarchical relations between medical entities of the same type

 

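Hierarchical relations between medical entities of the same type are relatively uniform and consist mainly of is-a and part-of relations. Because medicine has a rigorous disciplinary system and industry norms, such relations are described fairly completely in medical dictionaries, encyclopedias, and information standards. Medical dictionaries and databases such as ICD-10 and SNOMED concentrate on classifying and standardizing professional medical terms and restricted vocabularies. They are drawn up by authoritative bodies and experts, cover a wide range, are built top-down, and are widely accepted by the medical profession, which makes them the first choice of data source for extracting hierarchical entity relations. Where medical dictionaries and knowledge bases provide data formats and open API interfaces, techniques such as crawlers, regular expressions, and D2R mapping can be used to extract the classification hierarchy, obtain tuples, and match hypernym-hyponym relations.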

 

2) Relations between medical entities of different types

 

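Relations between medical entities of different types can be recognized from two kinds of data source. One consists of encyclopedias and structured sources such as Medline and UMLS. The other consists of semi-structured electronic medical records.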

 

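The types of medical entity are limited, chiefly diseases, symptoms, treatments, and drugs. The usual practice today is to fix the entity types, predefine the relation types to be extracted, and then turn the extraction task into a classification problem. There is not yet a unified standard for predefining entity relations. The choice depends on the schema graph design, entity recognition, data sources, purpose, and application scenario of the knowledge graph being built. In the I2B2 2010 evaluation, for example, entity relations in electronic medical records were divided into three classes: medical problem to medical problem, medical problem to test, and medical problem to treatment.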

 

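Uzuner et al. extracted relations between medical problem entities from electronic medical records. They used entity order and distance, link grammar, and lexical features to train 6 SVM classifiers, and their comparative experiments showed that lexical features play an important role in entity relation recognition. Building on this, Frunza et al. used Medline abstracts to extract several kinds of relations between diseases and treatments, and medical entities were also extracted on the basis of UMLS with good experimental results. Abacha et al. likewise used a hybrid model that combines manually built patterns with an SVM and obtained an average F-score of 94.07%. Their study noted that pattern matching does most of the work when few samples are available, whereas the SVM does most of the work when there are many.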

    

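In comparative studies of classification methods for relation recognition, Bruijn et al. used the I2B2 2010 evaluation to compare supervised classification with semi-supervised classification based on self-training, and they examined how UMLS, syntactic features, and unlabeled data affect relation recognition. Besides predefining relations and converting the task into classification, researchers have also extracted relations with pattern matching and co-occurrence statistics. Examples include extracting relations from Medline abstracts by counting co-occurrences, building a relation graph from the co-occurrences, and extracting relations from Medline abstracts through syntactic analysis and graph pattern matching.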

 

1.2.3 Attribute extraction

 

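Attribute extraction means extracting attribute-value pairs (AVPs). Extracting attributes means building an attribute list for each medical entity; for a drug, the attributes include indications and contraindications. Extracting attribute values means attaching concrete values to each entity, for example that amoxicillin is contraindicated for patients allergic to penicillin. Common approaches include extracting from Linked Open Data, from structured databases, and from encyclopedia sites, inducing wrappers for vertical sites, and mining query logs with pattern matching. For medical knowledge graphs, the main sources are the medical dictionaries mentioned above and medical websites. It is worth noting that the former yield relatively few attribute-value pairs (especially for Chinese), and medical websites are a good supplement.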

 

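On sites where AVPs are highly structured, such as Wikipedia and A+ Medical Encyclopedia, regular information boxes (InfoBox) make attributes easy to extract. The advantage of an InfoBox is that the attribute values of an entity can be read off directly with high credibility. The drawback is its small scale. For medical websites with varied formats and little structure, the usual approach is to build a site-specific wrapper: a few typical Detailed Pages are annotated on the target site, pattern learning is applied to these pages to derive one or more patterns expressed in an XPath-like form, and the patterns are then applied to the site's other detailed pages, which makes AVP extraction automatic.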
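A hedged sketch of the wrapper idea above: once an XPath-like pattern has been derived for a site, it can be applied to each detailed page to read attribute-value pairs. The page markup, the `infobox` class name, and the pattern `ROW_PATTERN` are hypothetical, and the pattern is written by hand here rather than learned. Python's standard `xml.etree.ElementTree` is used, which supports only a limited XPath subset and needs well-formed markup.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical, well-formed detail page; real sites usually need an HTML parser.
DETAIL_PAGE = """
<html><body>
  <h1>Amoxicillin</h1>
  <table class="infobox">
    <tr><th>Indications</th><td>bacterial infections</td></tr>
    <tr><th>Contraindications</th><td>penicillin allergy</td></tr>
  </table>
</body></html>
"""

# An XPath-like pattern of the kind a wrapper would store for this (made-up) site.
ROW_PATTERN = ".//table[@class='infobox']/tr"


def extract_avps(page: str, row_pattern: str = ROW_PATTERN) -> Dict[str, str]:
    """Apply the row pattern and return {attribute: value} for one detailed page."""
    root = ET.fromstring(page.strip())
    pairs = {}
    for row in root.findall(row_pattern):
        attribute = row.findtext("th")
        value = row.findtext("td")
        if attribute and value:
            pairs[attribute.strip()] = value.strip()
    return pairs


avps = extract_avps(DETAIL_PAGE)
```

The same pattern can then be run over every other detailed page of the site, which is what makes the extraction automatic once the pattern exists.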

 

1.3 Medical knowledge fusion

 

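Knowledge fusion is a high-level form of knowledge organization. It brings knowledge from different sources under one framework and specification, where it goes through heterogeneous data integration, disambiguation, processing, reasoning-based verification, and updating. Its aim is to solve the problem of knowledge reuse and to strengthen the internal logic and expressive power of the knowledge base. Depending on the granularity of the knowledge objects in the graph, knowledge fusion can be subdivided into entity alignment, knowledge base fusion, and so on.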

 

1.3.1 Entity alignment

 

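The diversity of knowledge sources in a medical knowledge graph leads to duplicated knowledge, uneven knowledge quality, and unclear links between pieces of knowledge. Medical entities suffer from a serious multiple-reference problem across data sources. For example, a single drug may be listed under one alias in Baidu Baike, under several other aliases in A+ Medical Encyclopedia, and under still other trade names elsewhere. Entity alignment is therefore a very important step in medical knowledge fusion. It is the process of deciding whether entities in multi-source heterogeneous data refer to the same real-world object.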

 
    
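Existing alignment algorithms fall into two classes: pairwise entity alignment and collective entity alignment. Pairwise methods consider only the similarity of the entities and their attributes, and they include methods based on traditional probabilistic models and methods based on machine learning. The former are represented by Fellegi et al., who turned entity alignment based on attribute similarity scores into a classification problem; their model has been applied in many entity alignment efforts. The latter include supervised methods, such as the classification and regression tree algorithm, ID3, SVM-based classification, and ensemble learning frameworks, as well as unsupervised methods such as hierarchical graph models.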
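Pairwise alignment as described above can be reduced, for illustration, to scoring attribute similarity and classifying a pair as a match when the score passes a threshold. This is a deliberately simplified stand-in, not the probabilistic model of Fellegi et al.: the Jaccard measure, the attribute weights, the threshold of 0.4, and the sample records are all assumptions of this sketch.

```python
from typing import Dict, Iterable, Set


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Overlap of two value sets: |A & B| / |A | B| (0.0 when both are empty)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def record_similarity(rec_a: Dict[str, Set[str]], rec_b: Dict[str, Set[str]],
                      weights: Dict[str, float]) -> float:
    """Weighted average of per-attribute Jaccard similarities."""
    total = sum(weights.values())
    return sum(w * jaccard(rec_a.get(attr, ()), rec_b.get(attr, ()))
               for attr, w in weights.items()) / total


def is_same_entity(rec_a, rec_b, weights, threshold: float = 0.4) -> bool:
    """Classify the pair as a match when the similarity reaches the threshold."""
    return record_similarity(rec_a, rec_b, weights) >= threshold


# Illustrative records from two imagined sources; the values are examples only.
source_a = {"names": {"aspirin", "acetylsalicylic acid"}, "indications": {"pain", "fever"}}
source_b = {"names": {"acetylsalicylic acid", "ASA"},
            "indications": {"pain", "fever", "inflammation"}}
other = {"names": {"amoxicillin"}, "indications": {"bacterial infection"}}
weights = {"names": 2.0, "indications": 1.0}  # hypothetical attribute weights

similarity_ab = record_similarity(source_a, source_b, weights)
```

Here the names overlap by 1/3 and the indications by 2/3, so the weighted score is 4/9. Collective alignment would additionally let the decision for one pair influence the scores of related pairs.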

 

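Collective entity alignment builds on pairwise alignment by taking the relations between entities into account when computing entity similarity. It is divided into local and global collective alignment. A typical local algorithm computes entity similarity with a vector space model and a similarity measure; its precision is not high, but its recall and speed are respectable. Global methods adjust the similarity between entities through the mutual influence of different matching decisions, and they are further divided into methods based on similarity propagation and methods based on probabilistic models. Similarity propagation starts from initial matches and iteratively produces new matches in a "bootstrapping" fashion. The SiGMa algorithm proposed on this basis by Lacoste-Julien et al. is better suited to large-scale knowledge bases, but it needs a certain amount of manual intervention. Probabilistic methods build complex probabilistic models of matching relations and decisions, including relational Bayesian network models, LDA models, CRF models, and Markov logic network models. They improve matching quality, but their efficiency still needs to improve.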

 

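When data from different knowledge sources conflict, factors such as the reliability of each source and how often a given piece of information appears across the sources have to be considered. When building a Chinese medicine knowledge graph, Ruan et al. scored the credibility of each data source, combined the score with the number of times an item appeared in the different sources, and filled the result into the corresponding attribute value field.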
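One simple way to combine source credibility with frequency of occurrence, offered only as an assumption-laden sketch, is to let every occurrence of a value contribute the credibility score of the source it came from and then rank values by the total. The scoring rule, the credibility numbers, and the name `rank_values` are hypothetical; the text does not specify the formula that was actually used.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def rank_values(observations: List[Tuple[str, str]],
                source_credibility: Dict[str, float]) -> List[Tuple[str, float]]:
    """Rank conflicting attribute values by the summed credibility of their sources.

    observations: (source name, attribute value) pairs for ONE attribute of ONE entity.
    Unknown sources contribute 0.0. Ties are broken alphabetically for determinism.
    """
    scores: Dict[str, float] = defaultdict(float)
    for source, value in observations:
        scores[value] += source_credibility.get(source, 0.0)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical credibility scores and conflicting values for one attribute.
credibility = {"dictionary": 0.9, "encyclopedia": 0.6, "forum": 0.2}
observations = [
    ("dictionary", "penicillin allergy"),
    ("encyclopedia", "penicillin allergy"),
    ("forum", "no known contraindication"),
    ("forum", "no known contraindication"),
]

ranked = rank_values(observations, credibility)
preferred_value = ranked[0][0]
```

In this toy case the value backed by the two more credible sources outranks the value that a low-credibility source repeats twice, which is the behavior the weighting is meant to produce.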

 

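As knowledge bases grow and the number of entities increases, entity alignment within knowledge bases receives more and more attention. Achieving it accurately and efficiently is one of the future research priorities of knowledge fusion.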

 

1.3.2 Knowledge base fusion

 

    ֪ʶʱͬᵼ֪ʶݵĶԺ칹ԡӵҽ֪ʶ˵ǰ֪ʶⶼijһij༲ҩģƢθ ֪ʶ[60]ҽҩ֪ʶͼ׵ȣҪõƵҽ֪ʶͼףҪԲͬҽ֪ʶںԼδǵ֪ʶͲϲ֪ʶںϵе֪ʶͼСҽ֪ʶͼ׵ĹһϵµĹ̡ 

 

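Research on knowledge base fusion began with "ontology matching", which at first matched ontologies by comparing the similarity of their classes. As knowledge bases grew larger and structurally more complex, classes, attributes, entities, and the relations among them also came to be taken into account. PARIS, the probability-based knowledge fusion algorithm proposed by Suchanek et al., takes two knowledge bases as input and efficiently aligns classes, instances, attributes, and relations across ontologies at the same time, although it still needs some manual input. Human cognition is limited, so automatically acquiring knowledge from the Web and fusing it is essential. Dong et al. took facts extracted in multiple ways, applied models such as PRA, and combined them with the Freebase graph. The resulting knowledge fusion method can automatically build a Web-scale probabilistic knowledge base and improves efficiency.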

 

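In the medical field, Dieng-Kuntz et al. converted a medical database into a medical ontology, then applied semi-automatic language tools to a text corpus and extended and completed the ontology under manual control, which formalized the conceptual hierarchy of the knowledge in a semi-automatic way. When adding a data source to a clinical information system, Baorto et al. first determined whether the items already existed, then added the new ones to the MED (Medical Entities Dictionary), while checking that the data remained consistent.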

 

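Fusion techniques for medical knowledge graphs have seen some meaningful attempts, but they still require a great deal of manual intervention, and efficient knowledge fusion algorithms need further study. Medical knowledge graphs could also consider crowdsourcing as a way to carry out knowledge fusion.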

 

1.4 Medical knowledge reasoning

 

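Reasoning mines implicit information from existing knowledge. Knowledge reasoning places more weight on how knowledge and methods are chosen and applied, keeps human involvement to a minimum, derives missing facts, and solves the problem at hand. In a medical knowledge graph, knowledge reasoning helps doctors collect patient data, diagnose and treat disease, and control the rate of medical error. However, even for the same disease, doctors may reach different diagnoses depending on the patient's condition. A medical knowledge graph therefore has to handle a large amount of repeated and contradictory information, which makes building a medical reasoning model more complex. Traditional knowledge reasoning methods include Description Logic (DL) reasoning, Rule-based Reasoning (RBR), and Case-based Reasoning (CBR). Bousquet C et al. used an ontology based on DAML+OIL description logic to improve signal detection in a pharmacovigilance system. Chen R et al. developed an RBR system that provides medication advice. The CARE-PARTNER system is a CBR-based system for recommending diagnosis and treatment plans.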

 

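Traditional knowledge reasoning has advanced the automation of medical diagnosis to some extent, but it also has shortcomings such as insufficient learning ability, low data utilization, and accuracy that needs improvement, and it is still far from meeting the requirements of practical use. With medical data growing, information is inevitably missed and delays occur during diagnosis. Artificial intelligence, and Artificial Neural Networks (ANNs) in particular, has a natural advantage in mining information from massive data. Combining ART-KNN (ART-Kohonen neural network) with CBR can improve the efficiency and accuracy of the latter. Neural tensor networks can infer unknown relations on open ontologies such as FreeBase with an accuracy of up to 90.0%. On the Pima Indians diabetes knowledge base (PIDD), Karegowda A G et al. used a hybrid model of a Genetic Algorithm (GA) and a Back Propagation Network (BPN) and improved accuracy by about 7%.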

 

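Graph-based reasoning treats the knowledge graph as a graph, with entities as nodes and relations or attributes as edges, and uses relation paths to find multi-step paths between nodes, as in Path Ranking and PTransE. The principle is to infer the relation between entities from such paths. Graph databases store a knowledge graph in a graph data structure, and compared with traditional databases they are markedly more efficient for highly connected queries. However, graph databases are not yet mature and cannot handle very complex knowledge reasoning. Popular graph databases include Neo4j, Titan, OrientDB, and ArangoDB. In one system concerned with doctor and patient safety, the medical ontology data are stored in the AllegroGraph graph database.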

 

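As with general-purpose knowledge graphs, research on medical knowledge graphs also covers reasoning across knowledge bases and knowledge reasoning based on fuzzy ontologies.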

 

1.5 Quality assessment

 

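The quality of data directly affects how the data can be used. Quality assessment is an important means of safeguarding the data and makes it possible to select data with high confidence. Medical diagnosis places higher demands on the credibility and accuracy of the data and of the medical knowledge graph. Quality assessment is not the last step of building a medical knowledge graph. It runs through the graph's entire life cycle. The 2013 Ontology Summit (Ontology Summit2013), for example, described which aspects need to be assessed at each stage of the ontology life cycle.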

 

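Assessment methods for knowledge graphs and ontologies can currently be divided into four broad classes: methods based on a gold standard, methods based on an ontology task or application, data-driven methods, and indicator-based methods. Table 1 compares these ontology assessment methods.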

 

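In the medical field specifically, ontology assessment is applied in different settings. Clarke et al. used a task-based method to assess the performance of the Gene Ontology from 2004 to 2012. Bright et al. used ontology design principles and review by domain experts as indicators to assess how effective an ontology was in an antibiotic decision support system. Gordon et al. built a "gold standard" from electronic medical records and clinical practice in order to improve the infectious disease ontology BCIDO. Some tools with user-friendly graphical interfaces speed up the automation of ontology assessment by wrapping different assessment methods. Different tools assess different indicators of an ontology from different angles, and each has its own emphasis, so only a suitably chosen tool can assess an ontology in a way that meets the needs of the application.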

 

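Commonly used ontology assessment tools include ODEval, OOPS, OntoManager, and Core. For medical ontologies, a medical knowledge graph contains a great deal of terminology, so assessment by domain experts is needed to confirm that the terminology and the knowledge are correct.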

 

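Compared with general-purpose knowledge graphs, quality assessment of medical knowledge graphs has the following characteristics: a) because of the rigor of medicine, several assessment methods must be combined to assess the graph from multiple angles, as Bright et al. did with ontology design principles and expert review; b) assessment concentrates on alerts with a high confidence level, such as prescription-related alerts (antibiotic-microbe mismatch alerts and drug-related alerts) and antibiotic-symptom mismatch alerts; c) beyond what is assessed for general-purpose knowledge graphs, attention is also paid to the coverage of knowledge, because the completeness and accuracy of knowledge directly affect the credibility of clinical decision support. In addition, medical knowledge graphs combine computer science with many medical disciplines and are interdisciplinary, so the assessment indicators cannot simply be copied from a single discipline and should take many factors into account. The assessment indicators for knowledge graphs and ontologies are shown in the table below.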