Application of Data Mining Technology in the Insurance Industry

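With the widespread use of database technology, databases in every industry have accumulated large volumes of data from which deeper information can be mined. How can these data be used more effectively? Taking the insurance industry as an example, this paper describes the importance of data mining technology to the industry's development.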


1 Introduction

With the rapid development of database technology and the widespread use of database management systems, every industry is accumulating more and more data. Important information lies hidden behind this surge of data, and people want to analyze it at a higher level in order to make better use of it. Current database systems can efficiently handle data entry, queries, statistics, and similar functions, but they cannot discover the relationships and rules that exist in the data, nor can they predict future trends from existing data. The lack of tools for mining the knowledge hidden behind the data has led to the "data explosion but knowledge poor" phenomenon.

With the development of computer and network technology, obtaining information relevant to a particular industry has become feasible. For data that are large in volume and wide in scope, traditional statistical methods that rely on simple summaries or on analysis against a predetermined model are no longer adequate. An intelligent information analysis technology, data mining, therefore came into being.

Data mining is the process of extracting implicit, previously unknown, but potentially useful information and knowledge from large volumes of incomplete, noisy, fuzzy, and random data. It discovers meaningful new associations, patterns, and trends by mining the large amounts of data stored in a data warehouse. As a new business information processing technology, data mining extracts, transforms, analyzes, and models the large volumes of business data held in commercial databases in order to obtain the key data that support business decisions, helping enterprises seize opportunities in a fiercely competitive market. The insurance industry currently has broad market demand for this technology.

2 Project Description

The project developed the "Insurance Industry Decision System V1.0". The system's main interface is written in ASP and provides data preprocessing, customer insurance purchase analysis, customer buying habit analysis, and result output. The back-end database is implemented in SQL Server 2005, and the mining tool is SPSS Clementine 11.0. During the experimental stage, the study addressed two major drawbacks of the Apriori algorithm, its storage complexity and its large number of redundant rules, by using a pattern tree structure to reduce the storage complexity of Apriori while reducing the number of redundant rules.

The system consists of four main functional modules: data preprocessing, customer insurance purchase analysis, customer buying habit analysis, and result output.

(1) The "data preprocessing" module includes upload, data platform, data processing, statistics, and data set generation functions.

● Upload: each branch of the insurance company uploads its data.

● Data platform: lets the user choose a data platform before uploading data.

● Data processing: cleans the data, converts formats, and performs related operations.

● Statistics: analyzes the preprocessed data and extracts the valid data.

● Data set generation: builds a data set from the valid data extracted in the statistics step, providing a higher-quality data source for data mining.
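The preprocessing steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names customer_id and product, the sample records, and the helper functions are hypothetical, and the source does not describe how the real system (ASP with SQL Server 2005) implements these steps.

```python
from collections import Counter

def clean_records(raw_records):
    """Data processing: drop incomplete rows, normalise formats, remove duplicates."""
    cleaned, seen = [], set()
    for rec in raw_records:
        customer = (rec.get("customer_id") or "").strip()
        product = (rec.get("product") or "").strip().lower()
        if not customer or not product:
            continue  # cleaning: skip records with a missing field
        key = (customer, product)
        if key in seen:
            continue  # cleaning: skip duplicate purchases
        seen.add(key)
        cleaned.append({"customer_id": customer, "product": product})
    return cleaned

def build_data_set(cleaned):
    """Data set generation: one transaction (set of products) per customer."""
    transactions = {}
    for rec in cleaned:
        transactions.setdefault(rec["customer_id"], set()).add(rec["product"])
    return transactions

# Hypothetical uploaded records from a branch.
raw = [
    {"customer_id": " C001 ", "product": "Life"},
    {"customer_id": "C001", "product": "life"},
    {"customer_id": "C001", "product": "Auto"},
    {"customer_id": "", "product": "Health"},
    {"customer_id": "C002", "product": None},
    {"customer_id": "C002", "product": "Health"},
]

cleaned = clean_records(raw)
data_set = build_data_set(cleaned)
# Statistics: how often each product appears in the valid data.
product_counts = Counter(rec["product"] for rec in cleaned)
```

The resulting data_set maps each customer to the set of products they hold, which is the transaction form that association rule mining expects.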

(2) The "customer insurance purchase analysis" module includes data import, parameter setting, result analysis, and other functions.

● Data import: in this interface the user selects a data platform and imports the data set generated by "data preprocessing".

● Parameter setting: in this interface the user sets "support", "confidence", and other parameters, and filters data records by value range so that the data set can be analyzed effectively.

● Result analysis: this interface displays the final results of the customer insurance purchase analysis as reports and charts. The results identify customers who buy several of the company's insurance products (or sub-products), giving the company a decision-making basis for "winning customers".
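To make the "support" and "confidence" parameters concrete, here is a minimal Python sketch using their standard association rule definitions. The transactions, product names, and threshold values are made up for illustration and do not come from the system described here.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(transactions, antecedent)
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base if base else 0.0

# Hypothetical data: the products each customer has bought.
transactions = [
    frozenset({"life", "auto"}),
    frozenset({"life", "health"}),
    frozenset({"life", "auto", "health"}),
    frozenset({"auto"}),
]

MIN_SUPPORT = 0.5      # assumed threshold, chosen by the user in the interface
MIN_CONFIDENCE = 0.6   # assumed threshold

s = support(transactions, {"life", "auto"})        # 2 of 4 transactions
c = confidence(transactions, {"life"}, {"auto"})   # 0.5 / 0.75
rule_kept = s >= MIN_SUPPORT and c >= MIN_CONFIDENCE
```

With these numbers the rule "life implies auto" has support 0.5 and confidence of about 0.67, so it passes both thresholds. Raising either threshold filters out more rules.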

(3) The "customer buying habit analysis" module includes data import, parameter setting, result analysis, and other functions.

● Data import: works the same way as "Data import" in module (2), "customer insurance purchase analysis".

● Parameter setting: the user sets the "input parameters" (basic customer information such as age, gender, and occupation) and the "output parameters" (the customer's insurance purchase information).

● Result analysis: this interface presents the results of the customer buying habit analysis, giving the company a decision-making basis for "retaining customers".
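The source does not say which model relates the input parameters to the output parameters. As a rough illustration only, the sketch below substitutes a plain frequency table: for each value of one input attribute, it reports the share of customers who hold each product. The customer records, the age_band helper, and the 40-year cut-off are all hypothetical.

```python
from collections import Counter, defaultdict

def purchase_profile(customers, attribute):
    """Return {attribute value: {product: share of customers holding it}}."""
    counts = defaultdict(Counter)
    totals = Counter()
    for cust in customers:
        value = cust[attribute]
        totals[value] += 1
        for product in cust["products"]:
            counts[value][product] += 1
    return {
        value: {p: n / totals[value] for p, n in counts[value].items()}
        for value in totals
    }

def age_band(age):
    """Hypothetical banding of the age input parameter."""
    return "under 40" if age < 40 else "40 and over"

# Hypothetical customers: input parameters plus purchased products.
customers = [
    {"age": 28, "gender": "F", "occupation": "teacher", "products": ["auto"]},
    {"age": 35, "gender": "M", "occupation": "driver", "products": ["auto", "accident"]},
    {"age": 52, "gender": "F", "occupation": "teacher", "products": ["life", "health"]},
    {"age": 61, "gender": "M", "occupation": "engineer", "products": ["life"]},
]
for cust in customers:
    cust["age_band"] = age_band(cust["age"])

profile = purchase_profile(customers, "age_band")
```

Here profile shows that every customer under 40 holds auto insurance and every customer aged 40 or over holds life insurance, the kind of habit a retention campaign could act on.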

(4) The "analysis result output" module prints the results of the "customer insurance purchase analysis" and the "customer buying habit analysis".

3 The Project's Improved Algorithm

The Apriori algorithm has two major defects: its time and space complexity are high, and it produces a large number of redundant rules. This project therefore uses a pattern tree structure to reduce the storage complexity of the Apriori algorithm while reducing the number of redundant rules.
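For reference, a compact version of the textbook Apriori procedure shows where the cost comes from: every level generates a fresh set of candidates and needs another full pass over the database. This is a generic sketch with made-up transactions, not the project's code.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return ({frequent itemset: count}, number of database scans)."""
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent, scans, k = {}, 0, 1
    while candidates:
        scans += 1  # each level requires a full pass over the database
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have k items.
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent, scans

# Hypothetical transactions: the products each customer has bought.
transactions = [
    frozenset({"life", "auto"}),
    frozenset({"life", "health"}),
    frozenset({"life", "auto", "health"}),
    frozenset({"auto"}),
]

frequent, scans = apriori(transactions, min_count=2)
```

Even on four transactions the database is scanned once per itemset size, and the candidate lists must be held in memory between passes. These are the costs that the pattern tree is meant to reduce.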

3.1 Pattern Tree Structure

The pattern tree consists of a root labeled "null", a set of item-prefix subtrees that are the children of the root, and an item header table. Each node in the tree contains four fields: user_id, count, node_link, and node_next. Here user_id is the user tag, which uniquely identifies a user; count is the number of paths that reach the node through its parent; node_link points to the next node in the tree with the same user_id, and is null when no such node exists; node_next points to the node's child nodes. Each entry in the item header table contains three fields: user_id, count, and head of node. user_id has the same meaning as in the tree; count is the total count of all nodes in the tree with that user_id; head of node is a pointer to the first node in the tree with that user_id.
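One possible way to express these two record types in Python is sketched below. The class names, the parent back-pointer, and the use of a dictionary of children in place of the node_next pointer are assumptions made for illustration; the source only names the fields.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass(eq=False)  # compare nodes by identity; the links form cycles
class PatternNode:
    user_id: Optional[str]                      # None stands for the "null" root label
    count: int = 0                              # paths reaching this node
    node_link: Optional["PatternNode"] = None   # next node with the same user_id
    # Assumption: a dict of children stands in for the node_next pointer.
    children: Dict[str, "PatternNode"] = field(default_factory=dict)
    # Assumption: a parent pointer, not one of the four fields in the text.
    parent: Optional["PatternNode"] = None

@dataclass(eq=False)
class HeaderEntry:
    user_id: str
    count: int = 0                                  # total over all nodes with this user_id
    head_of_node: Optional[PatternNode] = None      # first node with this user_id

# A root with a single child, plus the matching header table entry.
root = PatternNode(user_id=None)
child = PatternNode(user_id="u1", count=1, parent=root)
root.children["u1"] = child
header = {"u1": HeaderEntry("u1", count=1, head_of_node=child)}
```

Following head_of_node and then node_link repeatedly visits every node that carries a given user_id without walking the whole tree.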

3.2 Creating the Pattern Tree

The algorithm is as follows.

Let the transaction database be A, and let Ai be one of its item sets.

Algorithm: Patterntree(T, p), constructs the pattern tree

Input: the user transaction database A

Output: the user pattern tree

Procedure Patterntree(T, p)
{
    Create_tree(T);    // create the Pattern-Tree root node, marked "null"
    t = T;             // t is the current node
    while A <> null do
    {
        read the next item set Ai from the transaction database; let p be its first item;
        while p != null do
        {
            if p.user_id == n.user_id for some ancestor n of t then
            {
                n.count = n.count + 1;
                t = n;
            }
            else if p.user_id == c.user_id for some child c of t then
            {
                c.count = c.count + 1;
                t = c;
            }
            else
                insert_Patterntree(T, p);    // insert p into the tree as a new child of the current node
            p = p.next;
        }
    }
}
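The construction procedure can be sketched in Python as follows. Several details are assumptions, since the pseudocode leaves them open: each transaction restarts at the root, a newly inserted node becomes the current node with a count of 1, and the header table is kept as a plain dictionary with an extra tail pointer for appending to the node_link chain.

```python
class Node:
    def __init__(self, user_id, parent=None):
        self.user_id = user_id    # None marks the "null" root
        self.count = 0
        self.parent = parent      # assumption: used to walk up to ancestors
        self.children = {}        # plays the role of node_next
        self.node_link = None     # next node carrying the same user_id

def find_ancestor(node, user_id):
    """Return the nearest ancestor of node with this user_id, or None."""
    node = node.parent
    while node is not None and node.user_id is not None:
        if node.user_id == user_id:
            return node
        node = node.parent
    return None

def build_pattern_tree(transactions):
    root = Node(None)
    header = {}  # user_id -> {"count": total, "head": first node, "tail": last node}
    for items in transactions:
        t = root  # assumption: every transaction starts again at the root
        for user_id in items:
            ancestor = find_ancestor(t, user_id)
            if ancestor is not None:
                node = ancestor                  # matches an ancestor of t
            elif user_id in t.children:
                node = t.children[user_id]       # matches a child of t
            else:
                node = Node(user_id, parent=t)   # insert as a new child of t
                t.children[user_id] = node
                entry = header.setdefault(
                    user_id, {"count": 0, "head": node, "tail": None})
                if entry["tail"] is not None:
                    entry["tail"].node_link = node
                entry["tail"] = node
            node.count += 1
            header[user_id]["count"] += 1
            t = node
    return root, header

# Hypothetical transactions (ordered item sets).
transactions = [["a", "b", "c"], ["a", "b", "d"], ["b", "c"]]
root, header = build_pattern_tree(transactions)
```

The first two transactions share the prefix a, b, so they share nodes and only the counts grow. The third starts a new branch under the root, and its b node is reachable from the first b node through node_link.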

3.3 Pruning the Pattern Tree

Once the pattern tree has been built, it may contain a large number of redundant branches. To ensure that the mining results are not affected by the noise these branches introduce, the tree must be pruned to remove the noise.

Algorithm: SPT(Tree, a), prunes the pattern tree

// SPT is the supported pattern tree (Supported Access Pattern Tree); a is the item header table

Input: the pattern tree PatternTree and Min_Sup (the minimum support for the pattern tree)

Output: the supported pattern tree SPT after pruning, and the patterns B = {bi | i = 1, 2, 3, ..., n}

SPT(Tree, a)
{
    i = 1;
    while (ai != null)    // for each entry ai in the item header table a
    {
        if (ai.count >= Min_Sup) then
        {
            pattern bi = ai.head of node;
            p = ai.head of node;    // p points to the position of ai in the pattern tree
            while (p != null and ai.count >= Min_Sup)
            {
                find the prefix base of p and join p with its prefix base to form pattern b;
                if (bi.count >= Min_Sup) then
                {
                    // bi.count is the minimum count over p and its prefix base in pattern b
                    keep p and its prefix base in pattern bi;
                    bi = bi.node_link
                }
                else
                {
                    delete from PatternTree the nodes corresponding to p and its prefix base in pattern b,
                    reattach their child nodes to the parent node, and modify the header table entry ai;
                    p = p.node_next    // p points to the next position in the pattern tree
                }
            }
        }
        else
        {
            modify the value of the header table entry ai;
            delete the corresponding nodes and their prefix bases from the pattern tree, and reattach the child nodes;
        }
        i++;    // move on to the next header table entry
    }
}
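The core of the pruning step, removing nodes whose total count falls below Min_Sup and reattaching their children to the parent, can be sketched in Python. This is a simplification under stated assumptions: it applies the Min_Sup test to the header table totals only, it does not build the patterns B, and the compact build helper and sample data are hypothetical.

```python
class Node:
    def __init__(self, user_id):
        self.user_id = user_id   # None marks the "null" root
        self.count = 0
        self.children = {}

def build(transactions):
    """Compact prefix-tree builder, used here only to get a tree to prune."""
    root = Node(None)
    for items in transactions:
        t = root
        for user_id in items:
            t = t.children.setdefault(user_id, Node(user_id))
            t.count += 1
    return root

def header_counts(node, totals=None):
    """Total count per user_id, as stored in the item header table."""
    totals = {} if totals is None else totals
    for child in node.children.values():
        totals[child.user_id] = totals.get(child.user_id, 0) + child.count
        header_counts(child, totals)
    return totals

def attach(parent, child):
    """Attach child under parent, merging with an existing child of the same id."""
    existing = parent.children.get(child.user_id)
    if existing is None:
        parent.children[child.user_id] = child
        return
    existing.count += child.count
    for grandchild in child.children.values():
        attach(existing, grandchild)

def prune(node, totals, min_sup):
    """Remove unsupported nodes bottom-up, reattaching children to the parent."""
    for child in list(node.children.values()):
        prune(child, totals, min_sup)
    for child in list(node.children.values()):
        if totals[child.user_id] < min_sup:
            del node.children[child.user_id]
            for grandchild in child.children.values():
                attach(node, grandchild)

# Hypothetical transactions; "x" and "d" each occur only once.
transactions = [["a", "b", "c"], ["a", "x", "c"], ["a", "b"], ["d"]]
root = build(transactions)
prune(root, header_counts(root), min_sup=2)
remaining = header_counts(root)
```

After pruning with min_sup=2, the nodes for x and d are gone and the c node that sat below x hangs directly under a, so the counts of the supported items are preserved.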

Building the pattern tree avoids multiple scans of the transaction database. The count field retains the number of occurrences of each item set, which avoids generating a large number of candidate frequent item sets and helps reduce time and space complexity. The tree structure also avoids a large number of redundant rules.

Pruning removes the many redundant branches produced while the pattern tree is generated, which reduces space complexity. The output patterns B can then be used to produce rules, which avoids generating large numbers of frequent item sets and reduces time complexity.

4 Conclusion

By using a pattern tree structure, the project improves the Apriori algorithm and compensates for its defects. The method improves on Apriori in both time complexity and space complexity while avoiding the generation of intermediate rules. The study indicates that using a pattern tree structure to reduce the storage complexity of the Apriori algorithm while reducing redundant rules is an effective way to improve the algorithm.

