CN109543178A - Method and system for constructing a judicial text labeling system - Google Patents
- Publication number
- CN109543178A
- Authority
- CN
- China
- Prior art keywords
- label
- vocabulary
- text
- judicial
- accuracy
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/18—Legal services
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Artificial Intelligence (AREA)
- Tourism & Hospitality (AREA)
- Economics (AREA)
- General Business, Economics & Management (AREA)
- Strategic Management (AREA)
- Primary Health Care (AREA)
- Marketing (AREA)
- Human Resources & Organizations (AREA)
- Technology Law (AREA)
- Probability & Statistics with Applications (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present application provides a method and system for constructing a judicial text label system. A word segmentation tool is used to obtain judicial vocabulary text, and a primary label system is built from word-frequency statistics. Labels with similar semantics in the primary label system are merged and obscure labels are expanded, yielding an extended label system. A text test set is then used to measure the accuracy with which the extended label system retrieves texts, which verifies whether construction of the current extended label system is complete; if it is not, the label system is optimized further. This allows a targeted label system to be built for each law and greatly improves the search accuracy of judicial texts.
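The steps named in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the whitespace tokenizer stands in for a real word segmentation tool, and the stop-word list, the `SYNONYMS` and `EXPANSIONS` tables, the sample documents, the exact-match accuracy definition, and the 0.9 threshold are all hypothetical.

```python
from collections import Counter

# Hypothetical stop words; a real system would use a curated list.
STOPWORDS = {"a", "at", "by", "for", "from", "of", "over", "the"}

def segment(text):
    """Stand-in for a word segmentation tool: lowercase whitespace split."""
    return text.lower().split()

def build_primary_labels(docs, min_count):
    """Primary label system: words whose corpus frequency reaches min_count."""
    counts = Counter(
        tok for doc in docs for tok in segment(doc) if tok not in STOPWORDS
    )
    return [word for word, n in counts.items() if n >= min_count]

def merge_similar(labels, synonyms):
    """Merge labels with similar semantics under one canonical label.

    Each canonical label keeps every merged word as a search term.
    """
    system = {}
    for label in labels:
        canonical = synonyms.get(label, label)
        system.setdefault(canonical, set()).update({canonical, label})
    return system

def expand_obscure(system, expansions):
    """Expand obscure labels with extra search terms."""
    return {
        label: terms | set(expansions.get(label, ()))
        for label, terms in system.items()
    }

def search(system, label, docs):
    """Return indices of documents containing any search term of the label."""
    terms = system.get(label, set())
    return {i for i, doc in enumerate(docs) if terms & set(segment(doc))}

def accuracy(system, docs, test_set):
    """Assumed metric: share of test queries whose result set is exactly right."""
    hits = sum(1 for label, expected in test_set
               if search(system, label, docs) == expected)
    return hits / len(test_set)

# Hypothetical corpus and test set.
DOCS = [
    "theft of a vehicle", "theft from a warehouse",
    "larceny of goods", "larceny at night",
    "contract breach by the seller", "contract for lease",
    "usufruct over farmland", "usufruct dispute", "usufructuary rights",
]
TEST_SET = [
    ("theft", {0, 1, 2, 3}),
    ("contract", {4, 5}),
    ("usufruct", {6, 7, 8}),
]
SYNONYMS = {"larceny": "theft"}             # hypothetical similarity table
EXPANSIONS = {"usufruct": ["usufructuary"]}  # hypothetical expansion table
THRESHOLD = 0.9                              # assumed completion criterion

primary = build_primary_labels(DOCS, min_count=2)
merged = merge_similar(primary, SYNONYMS)
extended = expand_obscure(merged, EXPANSIONS)
# If the accuracy falls below the threshold, the label system would be
# optimized further and re-tested.
is_complete = accuracy(extended, DOCS, TEST_SET) >= THRESHOLD
```

In this toy corpus the merge step lets a search for "theft" also retrieve the "larceny" documents, and the expansion step lets "usufruct" retrieve the "usufructuary" document, which is why accuracy rises after each stage.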
Description
| Merit 1 label | Law articles applicable to merit 1 | Article 1 of ×× Law | Article 1 label |
| Merit 2 label | Law articles applicable to merit 2 | Article 2 of ×× Law | Article 2 label |
| … | … | … | … |
| Merit N label | Law articles applicable to merit N | Article × of another law | Article N label |
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201811294777.8A CN109543178B (en) | 2018-11-01 | 2018-11-01 | Method and system for constructing judicial text labeling system |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201811294777.8A CN109543178B (en) | 2018-11-01 | 2018-11-01 | Method and system for constructing judicial text labeling system |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN109543178A | 2019-03-29 |
| CN109543178B | 2023-02-28 |
Family
ID=65846358
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201811294777.8A Active CN109543178B (en) | 2018-11-01 | 2018-11-01 | Method and system for constructing judicial text labeling system |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN109543178B (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2004318381A (en) * | 2003-04-15 | 2004-11-11 | National Institute Of Advanced Industrial & Technology | Synonymity calculation method, synonymity calculation program, computer-readable recording medium on which synonymity calculation program is recorded |
| JP2017078919A (en) * | 2015-10-19 | 2017-04-27 | 日本電信電話株式会社 | Word expansion device, classification device, machine learning device, method, and program |
| CN106682149A (en) * | 2016-12-22 | 2017-05-17 | 湖南科技学院 | Label automatic generation method based on meta-search engine |
| CN107577785A (en) * | 2017-09-15 | 2018-01-12 | 南京大学 | A Hierarchical Multi-Label Classification Approach for Legal Identification |
Cited By (27)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112084290B (en) * | 2019-06-13 | 2024-04-05 | 北京沃东天骏信息技术有限公司 | Data retrieval method, device, equipment and storage medium |
| CN112084290A (en) * | 2019-06-13 | 2020-12-15 | 北京沃东天骏信息技术有限公司 | Data retrieval method, device, equipment and storage medium |
| CN110675241A (en) * | 2019-08-15 | 2020-01-10 | 上海新颜人工智能科技有限公司 | Label calibration system and method |
| CN110929513A (en) * | 2019-10-31 | 2020-03-27 | 北京三快在线科技有限公司 | Text-based label system construction method and device |
| CN110928981A (en) * | 2019-11-18 | 2020-03-27 | 佰聆数据股份有限公司 | Method, system and storage medium for establishing and perfecting iteration of text label system |
| CN111177388A (en) * | 2019-12-30 | 2020-05-19 | 联想(北京)有限公司 | Processing method and computer equipment |
| CN111177388B (en) * | 2019-12-30 | 2023-07-21 | 联想(北京)有限公司 | Processing method and computer equipment |
| CN113065312A (en) * | 2020-01-02 | 2021-07-02 | 北京沃东天骏信息技术有限公司 | Method and device for extracting text labels |
| CN111353045A (en) * | 2020-03-18 | 2020-06-30 | 智者四海(北京)技术有限公司 | Method for constructing text classification system |
| CN111353045B (en) * | 2020-03-18 | 2023-12-22 | 智者四海(北京)技术有限公司 | Method for constructing text classification system |
| CN111221974A (en) * | 2020-04-22 | 2020-06-02 | 成都索贝数码科技股份有限公司 | Method for constructing news text classification model based on hierarchical structure multi-label system |
| CN111221974B (en) * | 2020-04-22 | 2020-08-14 | 成都索贝数码科技股份有限公司 | Method for constructing news text classification model based on hierarchical structure multi-label system |
| CN111524043A (en) * | 2020-04-24 | 2020-08-11 | 南京擎盾信息科技有限公司 | Method and device for automatically generating litigation risk assessment questionnaire |
| CN111666771A (en) * | 2020-06-05 | 2020-09-15 | 北京百度网讯科技有限公司 | Semantic label extraction device, electronic equipment and readable storage medium of document |
| CN111666771B (en) * | 2020-06-05 | 2024-03-08 | 北京百度网讯科技有限公司 | Semantic tag extraction device, electronic equipment and readable storage medium for document |
| CN112148868A (en) * | 2020-09-27 | 2020-12-29 | 南京大学 | Law recommendation method based on law co-occurrence |
| CN112365372A (en) * | 2020-10-09 | 2021-02-12 | 银江股份有限公司 | Judgment document oriented quality detection and evaluation method and system |
| CN112365372B (en) * | 2020-10-09 | 2024-01-12 | 银江技术股份有限公司 | Quality detection and evaluation method and system for referee document |
| CN112925902A (en) * | 2021-02-22 | 2021-06-08 | 新智认知数据服务有限公司 | Method and system for intelligently extracting text abstract in case text and electronic equipment |
| CN112925902B (en) * | 2021-02-22 | 2024-01-30 | 新智认知数据服务有限公司 | Method, system and electronic equipment for intelligently extracting text abstract from case text |
| CN113505192A (en) * | 2021-05-25 | 2021-10-15 | 平安银行股份有限公司 | Data tag library construction method and device, electronic equipment and computer storage medium |
| CN113948087A (en) * | 2021-09-13 | 2022-01-18 | 北京数美时代科技有限公司 | Voice tag determination method, system, storage medium and electronic equipment |
| CN114254116A (en) * | 2021-12-30 | 2022-03-29 | 智慧芽信息科技(苏州)有限公司 | Document text classification method, classification model construction method and classification device |
| CN114647745A (en) * | 2022-03-22 | 2022-06-21 | 广东省电信规划设计院有限公司 | Label expanding method and device for document search and computer storage medium |
| CN115293141A (en) * | 2022-06-23 | 2022-11-04 | 中国第一汽车股份有限公司 | Method, system and electronic device for preparing vehicle-mounted normalized vocabulary |
| CN118886406A (en) * | 2024-07-15 | 2024-11-01 | 广州泰司贝网络科技有限公司 | A legal text generation system and method based on intelligent interaction |
| CN118886406B (en) * | 2024-07-15 | 2025-02-07 | 广州泰司贝网络科技有限公司 | Legal text generation system and method based on intelligent interaction |
Also Published As
| Publication number | Publication date |
|---|---|
| CN109543178B (en) | 2023-02-28 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN109543178A (en) | Method and system for constructing a judicial text labeling system | |
| CN108509425B (en) | A Novelty-based Chinese New Word Discovery Method | |
| CN106598944B (en) | A civil aviation security public opinion sentiment analysis method | |
| CN104615593B (en) | Hot microblog topic automatic testing method and device | |
| CN104391942B (en) | Short essay eigen extended method based on semantic collection of illustrative plates | |
| CN102929873B (en) | Method and device for extracting searching value terms based on context search | |
| CN102929937B (en) | Based on the data processing method of the commodity classification of text subject model | |
| CN103678576B (en) | The text retrieval system analyzed based on dynamic semantics | |
| CN111190900B (en) | JSON data visualization optimization method in cloud computing mode | |
| CN101231634B (en) | Autoabstract method for multi-document | |
| CN103235774B (en) | A kind of science and technology item application form Feature Words extracting method | |
| CN110059311A (en) | A kind of keyword extracting method and system towards judicial style data | |
| CN101751455B (en) | A Method of Automatically Generating Headlines Using Artificial Intelligence Technology | |
| CN110442760A (en) | A kind of the synonym method for digging and device of question and answer searching system | |
| CN106951438A (en) | A kind of event extraction system and method towards open field | |
| CN103309862B (en) | Webpage type recognition method and system | |
| CN114706972B (en) | An automatic generation method of unsupervised scientific and technological information summaries based on multi-sentence compression | |
| CN106294744A (en) | Interest recognition methods and system | |
| CN109597995A (en) | A kind of document representation method based on BM25 weighted combination term vector | |
| CN114491062B (en) | Short text classification method integrating knowledge graph and topic model | |
| CN114048305A (en) | A similar case recommendation method for administrative punishment documents based on graph convolutional neural network | |
| CN110781679A (en) | News event keyword mining method based on associated semantic chain network | |
| CN101923556B (en) | Method and device for searching webpages according to sentence serial numbers | |
| CN107145514A (en) | Chinese sentence pattern sorting technique based on decision tree and SVM mixed models | |
| CN108062351A (en) | Text snippet extracting method, readable storage medium storing program for executing on particular topic classification |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | CB02 | Change of applicant information | Address after: 310012 1st floor, building 1, 223 Yile Road, Hangzhou City, Zhejiang Province; Applicant after: Yinjiang Technology Co.,Ltd.; Address before: 310012 1st floor, building 1, 223 Yile Road, Hangzhou City, Zhejiang Province; Applicant before: ENJOYOR Co.,Ltd. |
| | GR01 | Patent grant | |
| | EE01 | Entry into force of recordation of patent licensing contract | Application publication date: 20190329; Assignee: HANGZHOU ENJOYOR SMART CITY TECHNOLOGY GROUP CO.,LTD.; Assignor: Yinjiang Technology Co.,Ltd.; Contract record no.: X2024980042648; Denomination of invention: A method and system for constructing a judicial text labeling system; Granted publication date: 20230228; License type: Common License; Record date: 20250102 |