By a News Reporter-Staff News Editor at Information Technology Newsweekly -- Researchers detail new data in Information Forensics and Security. According to news reporting out of San Antonio, Texas, by VerticalNews editors, the research stated, "Over 20 studies have been published in the past decade involving file and data type classification for digital forensics and information security applications. Methods using n-grams as inputs have proven the most successful across a wide variety of types; however, there are mixed results regarding the utility of unigrams and bigrams as inputs independently."
Our news journalists obtained a quote from the research from the University of Texas, "In this study, we use support vector machines (SVMs) consisting of unigrams and bigrams, as well as complexity and other byte frequency-based measures, as inputs. Using concatenated unigrams and bigrams as input and a linear kernel SVM, we achieve significantly improved results over those previously reported (73.4% classification rate across 38 file and data types). We are the first to use concatenated n-grams as the sole input, and we show their superiority over inputs used previously. We also found that too many different types of features as inputs result in overfitting and poor generalization properties. We include several types seldom or not studied in the past (Microsoft Office 2010 files, file system data, base64, base85, URL encoding, flash video, M4A, MP4, WMV, and JSON records)."
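To illustrate what "concatenated unigrams and bigrams" can look like as a classifier input, here is a minimal sketch in Python. It is not the Sceadan implementation: the function name `concat_ngram_vector`, the relative-frequency normalization, and the vector layout (256 unigram slots followed by 65,536 bigram slots) are assumptions made for illustration only.

```python
from collections import Counter

UNIGRAM_DIM = 256          # one slot per possible byte value
BIGRAM_DIM = 256 * 256     # one slot per ordered pair of byte values


def concat_ngram_vector(data: bytes) -> list[float]:
    """Return byte unigram frequencies followed by byte bigram frequencies.

    Hypothetical helper: the normalization (counts divided by the number
    of n-grams observed) is an assumption, not taken from the paper.
    """
    vec = [0.0] * (UNIGRAM_DIM + BIGRAM_DIM)
    if not data:
        return vec

    # Unigram part: relative frequency of each byte value.
    for byte, count in Counter(data).items():
        vec[byte] = count / len(data)

    # Bigram part: relative frequency of each adjacent byte pair,
    # stored after the unigram slots at offset a * 256 + b.
    n_bigrams = len(data) - 1
    if n_bigrams > 0:
        for (a, b), count in Counter(zip(data, data[1:])).items():
            vec[UNIGRAM_DIM + a * 256 + b] = count / n_bigrams

    return vec


if __name__ == "__main__":
    features = concat_ngram_vector(b"abab")
    print(len(features))  # 65792 = 256 + 65536
```

Vectors built this way, one per file or data fragment, could then be passed to a linear-kernel SVM (for example, scikit-learn's `LinearSVC`) together with the known type labels; that pairing is likewise a sketch of the general approach rather than the authors' exact pipeline.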
According to the news editors, the research concluded: "The 'winning' approach is instantiated in an open source software tool called Sceadan (Systematic Classification Engine for Advanced Data ANalysis)."
For more information on this research see: Sceadan: Using Concatenated N-Gram Vectors for Improved File and Data Type Classification. IEEE Transactions on Information Forensics and Security, 2013;8(9):1519-1530. IEEE Transactions on Information Forensics and Security can be contacted at: IEEE (Institute of Electrical and Electronics Engineers Inc.), 445 Hoes Lane, Piscataway, NJ 08855-4141, USA. (Institute of Electrical and Electronics Engineers - www.ieee.org/; IEEE Transactions on Information Forensics and Security - ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=10206)
Our news journalists report that additional information may be obtained by contacting N.L. Beebe, University of Texas at San Antonio, Management Science and Statistics Department, San Antonio, TX 78249, United States. Additional authors for this research include L.A. Maddox, L.S. Liu and M.H. Sun.
Keywords for this news article include: Texas, San Antonio, United States, North and Central America, Information Forensics and Security
Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2013, NewsRx LLC