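Resource Overview
Python implementation of an intelligent question answering system based on a knowledge graph (following the Fudan University paper on a question answering system built on QA corpora and a knowledge base)
Code Snippet and File Information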
# -*- coding: utf-8 -*-
import pickle
from connectSQLServer import connectSQL

# Connection settings for the SQL Server database that stores the triples.
host = '172.16.54.33'
user = 'sa'
password = 'chentian184616_'
database = 'chentian'
querySQL = connectSQL(host, user, password, database)
class Pro_onlines(object):
    def __init__(self, EV):
        self.EV = EV
        # Query templates; the '%s' placeholders take the entity and the property.
        self.sql_current_evp = "SELECT COUNT(*) FROM [chentian].[dbo].[baike_triples1] WHERE entity ='%s' AND property='%s'"
        self.sql_baidutag = "SELECT value FROM [chentian].[dbo].[baike_triples1] WHERE entity ='%s' AND property='BaiduTAG'"
        # self.entity_values=entity_values
        # self.value_entities=value_entities
        with open("./../data/concept_count.pkl", 'rb') as f:
            self.concept_fre = pickle.load(f)
    def calculate_piq(self, que1):
        """
        Compute the probability of each entity in the current question qi, which can be read as the
        importance of that entity to the question. Three probabilities are involved, and all three
        can be computed from the current question.
        The first is p(e|q). Because the EV pairs may contain several duplicate records of the same
        entity, the denominator has to be accumulated (see lines 56-57 of the code); the numerator
        is the recorded frequency of the entity.
        The second, following the two papers, is the probability of the categories of the current
        entity, stored in the form e: {c1: pre1, c2: pre2, ...}.
        The third: the paper states that when an (e, p) pair has several values they share a uniform
        probability, and a unique value has probability one. That completes the three probabilities.
        :param que1: the current question (qi) together with the data built from its triples
        :return: p(e|q) for each entity in the current question
        """
        print(que1.keys(), "$$$$$$$$$$$$$$")
        evi = list(que1.values())[0]  # all (entity-property-value) triples in the question
        currente_pre1 = {}  # first probability p(e|qi) for the current question
        currente_pre2 = {}  # for each entity, the frequency of the different entities sharing its value
        current_pteq = {}  # category probability for the question template; e_c = {} holds each entity's category probabilities e: {c1: pre1, c2: pre2, ...}
        current_pvep = {}  # probability of the value for the entity and intent of the current question
        for key in evi.keys():
            e_c_pre = {}  # frequency of each category c for the current entity e
            e, p, v = key.split("&&&&&")
            # For each v, go through all identical v in every question to find the matching
            # entity e and record how often that entity occurs. An entity occurring several
            # times does not affect the first probability but does affect the second: a result
            # already obtained is reset to empty if the repeated occurrence has no BaiduTAG.
            if e != '' and p != '' and v != '':
                current_e = 0  # frequency of the current entity (numerator)
                current_alle = 0  # total count over the entities sharing this value (denominator)
                entity_value_temp = self.entity_values[e]  # values of this entity and their frequencies
                value_entity_temp = self.value_entities[v]  # entities of this value and their frequencies
                for entity_key, entity_pre in entity_value_temp.items():
                    if entity_key == v:
                        current_e = entity_pre
                        current_alle = sum(list(value_entity_temp.values()))
                        currente_pre1[e] = float(current_e) / float(current_alle)
        print(currente_pre1)
            # current_pvep_pre=querySQL.Query(self.sql_current_evp%(e,p))[''][0] # count the different values v for the same entity e and the same intent p
        #     current_pe=0 # number of records matching both the current entity and its category (p, e)
        #     current_allp=0 # frequency of the current entity across the whole EV, used as the denominator for the category
        #     for que_ev in self.EV: # this for loop checks every entity against every question
        #         current_evi=list(que_ev.values())[0] # all entity pairs of the current question in EV
        #         for key1 in current_evi.keys():  # for each entity pair
        #             e1,p1,v1=key1.split("&&&&&")
        #             if v ==v1: # if the values are the same
        #                 current_alle+=1 # how many entities share the value of this entity
        #                 if e1==e:curr
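
The snippet above is cut off and relies on self.entity_values and self.value_entities, which are commented out in __init__. As a rough illustration of the first and third probabilities described in the docstring, here is a minimal standalone sketch. The function names, the sample triples, and the nested-dictionary layout (entity -> {value: count} and value -> {entity: count}) are assumptions made for this example, not the author's actual code.

```python
from collections import defaultdict


def build_indexes(triples):
    """Count (entity, value) co-occurrences in both directions.

    Assumed layout: entity_values[e][v] and value_entities[v][e] both hold
    the number of triples in which entity e appears with value v.
    """
    entity_values = defaultdict(lambda: defaultdict(int))
    value_entities = defaultdict(lambda: defaultdict(int))
    for entity, _prop, value in triples:
        entity_values[entity][value] += 1
        value_entities[value][entity] += 1
    return entity_values, value_entities


def entity_given_value(entity, value, entity_values, value_entities):
    """First probability: count(e, v) over the summed counts of all entities sharing v."""
    numerator = entity_values.get(entity, {}).get(value, 0)
    denominator = sum(value_entities.get(value, {}).values())
    return numerator / denominator if denominator else 0.0


def uniform_value_probability(values):
    """Third probability: distinct values of one (e, p) pair share the mass evenly."""
    distinct = set(values)
    return {v: 1.0 / len(distinct) for v in distinct}


# Hypothetical triples, used only to exercise the functions.
triples = [
    ("A", "location", "Shanghai"),
    ("A", "city", "Shanghai"),
    ("B", "location", "Shanghai"),
    ("B", "founded", "1905"),
]
entity_values, value_entities = build_indexes(triples)
print(entity_given_value("A", "Shanghai", entity_values, value_entities))
print(uniform_value_probability(["x", "y"]))
```

Lookups go through .get so that a missing entity or value yields 0.0 instead of raising KeyError or silently adding empty entries to the indexes.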