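For the WEKA API, see the official site: http://www.cs.waikato.ac.nz/ml/weka/
WEKA has its own dedicated file format, ARFF.
But once the data has been preprocessed in memory, having to save it as an *.arff file
and then ask the WEKA API to read that *.arff file back in is obviously a silly thing to do.
After some digging, I finally found in the official documentation how to build an Instances dataset directly in memory!
Sample code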
// Uses the WEKA 3.6-style API (FastVector and the concrete Instance class).
import weka.core.Attribute;
import weka.core.FastVector;
import weka.core.Instance;
import weka.core.Instances;

public class CreateInstancesExample {
  public static void main(String[] args) throws Exception {
    FastVector atts;
    FastVector attsRel;
    FastVector attVals;
    FastVector attValsRel;
    Instances data;
    Instances dataRel;
    double[] vals;
    double[] valsRel;
    int i;

    // 1. set up attributes
    atts = new FastVector();
    // - numeric
    atts.addElement(new Attribute("att1"));
    // - nominal
    attVals = new FastVector();
    for (i = 0; i < 5; i++)
      attVals.addElement("val" + (i + 1));
    atts.addElement(new Attribute("att2", attVals));
    // - string
    atts.addElement(new Attribute("att3", (FastVector) null));
    // - date
    atts.addElement(new Attribute("att4", "yyyy-MM-dd"));
    // - relational
    attsRel = new FastVector();
    // -- numeric
    attsRel.addElement(new Attribute("att5.1"));
    // -- nominal
    attValsRel = new FastVector();
    for (i = 0; i < 5; i++)
      attValsRel.addElement("val5." + (i + 1));
    attsRel.addElement(new Attribute("att5.2", attValsRel));
    dataRel = new Instances("att5", attsRel, 0);
    atts.addElement(new Attribute("att5", dataRel, 0));

    // 2. create Instances object
    data = new Instances("MyRelation", atts, 0);

    // 3. fill with data
    // first instance
    vals = new double[data.numAttributes()];
    // - numeric
    vals[0] = Math.PI;
    // - nominal (stored as the index of the label)
    vals[1] = attVals.indexOf("val3");
    // - string
    vals[2] = data.attribute(2).addStringValue("This is a string!");
    // - date
    vals[3] = data.attribute(3).parseDate("2001-11-09");
    // - relational: nothing is added here, so vals[4] stays 0.0 and refers to
    //   the first relation stored in att5 (the one the second instance adds
    //   below); that is why both rows print the same relational value
    dataRel = new Instances(data.attribute(4).relation(), 0);
    // add
    data.add(new Instance(1.0, vals));

    // second instance
    vals = new double[data.numAttributes()];  // important: needs NEW array!
    // - numeric
    vals[0] = Math.E;
    // - nominal
    vals[1] = attVals.indexOf("val1");
    // - string
    vals[2] = data.attribute(2).addStringValue("And another one!");
    // - date
    vals[3] = data.attribute(3).parseDate("2000-12-01");
    // - relational
    dataRel = new Instances(data.attribute(4).relation(), 0);
    // -- first instance
    valsRel = new double[2];
    valsRel[0] = Math.E + 1;
    valsRel[1] = attValsRel.indexOf("val5.4");
    dataRel.add(new Instance(1.0, valsRel));
    // -- second instance
    valsRel = new double[2];
    valsRel[0] = Math.E + 2;
    valsRel[1] = attValsRel.indexOf("val5.1");
    dataRel.add(new Instance(1.0, valsRel));
    vals[4] = data.attribute(4).addRelation(dataRel);
    // add
    data.add(new Instance(1.0, vals));

    // 4. output data
    System.out.println(data);
  }
}
The output looks like this:
@relation MyRelation
@attribute att1 numeric
@attribute att2 {val1,val2,val3,val4,val5}
@attribute att3 string
@attribute att4 date yyyy-MM-dd
@attribute att5 relational
@attribute att5.1 numeric
@attribute att5.2 {val5.1,val5.2,val5.3,val5.4,val5.5}
@end att5
@data
3.141593,val3,'This is a string!',2001-11-09,'3.718282,val5.4\n4.718282,val5.1'
2.718282,val1,'And another one!',2000-12-01,'3.718282,val5.4\n4.718282,val5.1'
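One thing the sample relies on is that every cell of an instance is a plain double: a nominal value is stored as the index of its label, and a date as milliseconds since the epoch. The sketch below mimics that encoding in plain Java without WEKA, just to make the idea concrete. The class and method names (EncodingSketch, encodeNominal, encodeDate) are made up for illustration, and pinning the time zone to UTC is my own assumption so the result is reproducible; WEKA itself may use the default time zone.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.List;
import java.util.TimeZone;

// Hypothetical helper, not part of WEKA: illustrates how values end up as doubles.
public class EncodingSketch {

    // Nominal value -> index of the label in the attribute's label list
    // (-1 if the label is unknown, as with List.indexOf).
    static double encodeNominal(List<String> labels, String label) {
        return labels.indexOf(label);
    }

    // Date value -> milliseconds since 1970-01-01, parsed with the given pattern.
    // Assumption: UTC is used here only to keep the result machine-independent.
    static double encodeDate(String pattern, String text) throws ParseException {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.parse(text).getTime();
    }

    public static void main(String[] args) throws ParseException {
        List<String> labels = Arrays.asList("val1", "val2", "val3", "val4", "val5");

        // Same layout as the first three non-string cells of the sample's first row.
        double[] vals = new double[3];
        vals[0] = Math.PI;                                 // numeric: stored as-is
        vals[1] = encodeNominal(labels, "val3");           // nominal: label index
        vals[2] = encodeDate("yyyy-MM-dd", "2001-11-09");  // date: epoch millis

        System.out.println(Arrays.toString(vals));
    }
}
```

This is also why the sample allocates a NEW double[] for each instance: the array itself is the row, so reusing it would overwrite the previous row's values.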