publications

publications in reverse chronological order, grouped by year.

2022

  1. FedSPLIT: One-Shot Federated Recommendation System Based on Non-negative Joint Matrix Factorization and Knowledge Distillation
    Eren, Maksim E, Richards, Luke E, Bhattarai, Manish, Yus, Roberto, Nicholas, Charles, and Alexandrov, Boian S
    arXiv preprint arXiv:2205.02359 2022
  2. General-Purpose Unsupervised Cyber Anomaly Detection via Non-Negative Tensor Factorization
    Eren, Maksim Ekin, Moore, Juston, Skau, Erik, Bhattarai, Manish, Moore, Elisabeth, Chennupati, Gopinath, and Alexandrov, Boian
    Digital Threats: Research and Practice 2022

2021

  1. MOTIF: A Large Malware Reference Dataset with Ground Truth Family Labels
    Joyce, Robert J., Amlani, Dev, Nicholas, Charles, and Raff, Edward
    2021
  2. A Framework for Cluster and Classifier Evaluation in the Absence of Reference Labels
    Joyce, Robert J., Raff, Edward, and Nicholas, Charles
    Proceedings of the 14th ACM Workshop on Artificial Intelligence and Security 2021
  3. Rank-1 Similarity Matrix Decomposition For Modeling Changes in Antivirus Consensus Through Time
    Joyce, Robert J., Raff, Edward, and Nicholas, Charles
    2021
  4. COVID-19 Multidimensional Kaggle Literature Organization
    Eren, Maksim Ekin, Solovyev, Nick, Hamer, Chris, McDonald, Renee, Alexandrov, Boian, and Nicholas, Charles
    In Proceedings of the ACM Symposium on Document Engineering 2021 2021
  5. Random Forest of Tensors (RFoT)
    Eren, Maksim Ekin, Nicholas, Charles, McDonald, Renee, and Hamer, Chris
    Presented at the 12th Annual Malware Technical Exchange Meeting, Online. 2021
  6. Evading Malware Classifiers via Monte Carlo Mutant Feature Discovery
    Boutsikas, John, Eren, Maksim Ekin, Varga, Charles, Raff, Edward, Matuszek, Cynthia, and Nicholas, Charles
    Presented at the 12th Annual Malware Technical Exchange Meeting, Online. 2021
  7. Bringing UMAP Closer to the Speed of Light with GPU Acceleration
    Nolet, Corey J., Lafargue, Victor, Raff, Edward, Nanditale, Thejaswi, Oates, Tim, Zedlewski, John, and Patterson, Joshua
    In The Thirty-Fifth AAAI Conference on Artificial Intelligence 2021
  8. Research Reproducibility as a Survival Analysis
    Raff, Edward
    In The Thirty-Fifth AAAI Conference on Artificial Intelligence 2021
  9. Classifying Sequences of Extreme Length with Constant Memory Applied to Malware Detection
    Raff, Edward, Fleshman, William, Zak, Richard, Anderson, Hyrum S., Filar, Bobby, and McLean, Mark
    In The Thirty-Fifth AAAI Conference on Artificial Intelligence 2021
  10. Accounting for Variance in Machine Learning Benchmarks
    Bouthillier, Xavier, Delaunay, Pierre, Bronzi, Mirko, Trofimov, Assya, Nichyporuk, Brennan, Szeto, Justin, Sepah, Naz, Raff, Edward, Madan, Kanika, Voleti, Vikram, Kahou, Samira Ebrahimi, Michalski, Vincent, Serdyuk, Dmitriy, Arbel, Tal, Pal, Chris, Varoquaux, Gaël, and Vincent, Pascal
    In Machine Learning and Systems (MLSys) 2021
  11. Exact Acceleration of K-Means++ and K-Means||
    Raff, Edward
    In 30th International Joint Conference on Artificial Intelligence (IJCAI-21) 2021
  12. Generating Thermal Human Faces for Physiological Assessment Using Thermal Sensor Auxiliary Labels
    Ordun, Catherine, Raff, Edward, and Purushotham, Sanjay
    In ICIP 2021
  13. Leveraging Uncertainty for Improved Static Malware Detection Under Extreme False Positive Constraints
    Nguyen, Andre T., Raff, Edward, Nicholas, Charles, and Holt, James
    In IJCAI-21 1st International Workshop on Adaptive Cyber Defense 2021

2020

  1. Multi-Dimensional Anomalous Entity Detection via Poisson Tensor Factorization
    Eren, Maksim Ekin, Moore, Juston, and Alexandrov, Boian
    In 2020 IEEE International Conference on Intelligence and Security Informatics (ISI) 2020
  2. Flexible and Adaptive Fairness-aware Learning in Non-stationary Data Streams
    Zhang, Wenbin, Zhang, Mingli, Zhang, Ji, Liu, Zhen, Chen, Zhiyuan, Wang, Jianwu, Raff, Edward, and Messina, Enza
    In 2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI) 2020
  3. Sampling Approach Matters: Active Learning for Robotic Language Acquisition
    Pillai, Nisha, Raff, Edward, Ferraro, Francis, and Matuszek, Cynthia
    In 2020 IEEE International Conference on Big Data (Big Data) 2020
  4. COVID-19 Kaggle Literature Organization
    Eren, Maksim Ekin, Solovyev, Nick, Raff, Edward, Nicholas, Charles, and Johnson, Ben
    In Proceedings of the ACM Symposium on Document Engineering 2020 2020
  5. A Survey of Machine Learning Methods and Challenges for Windows Malware Classification
    In NeurIPS 2020 Workshop: ML Retrospectives, Surveys & Meta-Analyses (ML-RSA) 2020
  6. The Use of AI for Thermal Emotion Recognition: A Review of Problems and Limitations in Standard Design and Data
    Ordun, Catherine, Raff, Edward, and Purushotham, Sanjay
    In AAAI FSS-20: Artificial Intelligence in Government and Public Sector 2020
  7. Robust Design of Deep Neural Networks against Adversarial Attacks based on Lyapunov Theory
    Rahnama, Arash, Nguyen, Andre T., and Raff, Edward
    In The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020
  8. Automatic Yara Rule Generation Using Biclustering
    Raff, Edward, Zak, Richard, Munoz, Gary Lopez, Fleming, William, Anderson, Hyrum S., Filar, Bobby, Nicholas, Charles, and Holt, James
    In 13th ACM Workshop on Artificial Intelligence and Security (AISec’20) 2020
  9. A New Burrows Wheeler Transform Markov Distance
    Raff, Edward, Nicholas, Charles, and McLean, Mark
    In The Thirty-Fourth AAAI Conference on Artificial Intelligence 2020
  10. Exploratory Analysis of Covid-19 Tweets using Topic Modeling, UMAP, and DiGraphs
    Ordun, Catherine, Purushotham, Sanjay, and Raff, Edward
    In epiDAMIK 2020: 3rd epiDAMIK ACM SIGKDD International Workshop on Epidemiology meets Data Mining and Knowledge Discovery 2020
  11. Cluster Quality Analysis Using Silhouette Score
    Shahapure, Ketan Rajshekhar, and Nicholas, Charles
    In 7th IEEE International Conference on Data Science and Advanced Analytics, DSAA 2020, Sydney, Australia, October 6-9, 2020 2020
  12. A Quantum Algorithm To Locate Unknown Hashes For Known N-Grams Within A Large Malware Corpus
    Allgood, Nicholas R., and Nicholas, Charles K.
    CoRR 2020

2019

  1. A Step Toward Quantifying Independently Reproducible Machine Learning Research
    Raff, Edward
    In Advances in Neural Information Processing Systems 2019
  2. PyLZJD: An Easy to Use Tool for Machine Learning
    Raff, Edward, Aurelio, Joe, and Nicholas, Charles
    In Proceedings of the 18th Python in Science Conference 2019
  3. KiloGrams: Very Large N-Grams for Malware Classification
    Raff, Edward, Fleming, William, Zak, Richard, Anderson, Hyrum, Finlayson, Bill, Nicholas, Charles K., and McLean, Mark
    In Proceedings of KDD 2019 Workshop on Learning and Mining for Cybersecurity (LEMINCS’19) 2019
  4. Barrage of random transforms for adversarially robust defense
    Raff, E., Sylvester, J., Forsyth, S., and McLean, M.
    In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2019
  5. Non-Negative Networks Against Adversarial Attacks
    Fleshman, William, Raff, Edward, Sylvester, Jared, Forsyth, Steven, and McLean, Mark
    AAAI-2019 Workshop on Artificial Intelligence for Cyber Security 2019
  6. Would a File by Any Other Name Seem as Malicious?
    Nguyen, Andre T, Raff, Edward, and Sant-Miller, Aaron
    In 2019 IEEE International Conference on Big Data (Big Data) 2019

2018

  1. Malware Detection by Eating a Whole EXE
    Raff, Edward, Barker, Jon, Sylvester, Jared, Brandon, Robert, Catanzaro, Bryan, and Nicholas, Charles
    In AAAI Workshop on Artificial Intelligence for Cyber Security 2018
  2. Static Malware Detection & Subterfuge: Quantifying the Robustness of Machine Learning and Current Anti-Virus
    Fleshman, William, Raff, Edward, Zak, Richard, McLean, Mark, and Nicholas, Charles
    In 2018 13th International Conference on Malicious and Unwanted Software (MALWARE) 2018
  3. Engineering a Simplified 0-Bit Consistent Weighted Sampling
    Raff, Edward, Sylvester, Jared, and Nicholas, Charles
    In Proceedings of the 27th ACM International Conference on Information and Knowledge Management 2018
  4. Gradient Reversal Against Discrimination: A Fair Neural Network Learning Approach
    Raff, Edward, and Sylvester, Jared
    In The 5th IEEE International Conference on Data Science and Advanced Analytics (DSAA) 2018
  5. Lempel-Ziv Jaccard Distance, an effective alternative to ssdeep and sdhash
    Digital Investigation 2018
  6. Hash-Grams: Faster N-Gram Features for Classification and Malware Detection
    In Proceedings of the ACM Symposium on Document Engineering 2018 2018

2017

  1. Learning the PE Header, Malware Detection with Minimal Domain Knowledge
    Raff, Edward, Sylvester, Jared, and Nicholas, Charles
    In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security 2017
  2. What can N-grams learn for malware detection?
    Zak, Richard, Raff, Edward, and Nicholas, Charles
    In 2017 12th International Conference on Malicious and Unwanted Software (MALWARE) 2017
  3. An Alternative to NCD for Large Sequences, Lempel-Ziv Jaccard Distance
    In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’17 2017
  4. JSAT: Java Statistical Analysis Tool, a Library for Machine Learning
    Raff, Edward
    Journal of Machine Learning Research 2017
  5. Malware Classification and Class Imbalance via Stochastic Hashed LZJD
    In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security 2017
  6. Document Engineering Issues in Malware Analysis
    Nicholas, Charles K.
    In Proceedings of the 2017 ACM Symposium on Document Engineering, DocEng 2017, Valletta, Malta, September 4-7, 2017 2017

2016

  1. An investigation of byte n-gram features for malware classification
    Raff, Edward, Zak, Richard, Cox, Russell, Sylvester, Jared, Yacci, Paul, Ward, Rebecca, Tracy, Anna, McLean, Mark, and Nicholas, Charles
    Journal of Computer Virology and Hacking Techniques 2016
  2. Document Engineering Issues in Malware Analysis
    Nicholas, Charles K., and Brandon, Robert
    In Proceedings of the 2016 ACM Symposium on Document Engineering, DocEng 2016, Vienna, Austria, September 13 - 16, 2016 2016

2015

  1. Document Engineering Issues in Document Analysis
    Nicholas, Charles K., and Brandon, Robert
    In Proceedings of the 2015 ACM Symposium on Document Engineering, DocEng 2015, Lausanne, Switzerland, September 8-11, 2015 2015

2013

  1. Document engineering education: workshop report
    Nicholas, Charles K., and Munson, Ethan V.
    SIGWEB Newsl. 2013
  2. Change-link 2.0: a digital forensic tool for visualizing changes to shadow volume data
    Leschke, Timothy R., and Nicholas, Charles K.
    In 10th Workshop on Visualization for Cyber Security, VizSec 2013, Atlanta, GA, USA, October 14, 2013 2013

2009

  1. Translation Corpus Source and Size in Bilingual Retrieval
    McNamee, Paul, Mayfield, James, and Nicholas, Charles K.
    In Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, May 31 - June 5, 2009, Boulder, Colorado, USA, Short Papers 2009
  2. Addressing morphological variation in alphabetic languages
    McNamee, Paul, Nicholas, Charles K., and Mayfield, James
    In Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2009, Boston, MA, USA, July 19-23, 2009 2009

2008

  1. Topological analysis of an online social network for older adults
    Wilson, Marcella, and Nicholas, Charles K.
    In Proceeding of the 2008 ACM Workshop on Search in Social Media, SSM 2008, Napa Valley, California, USA, October 30, 2008 2008
  2. Don’t have a stemmer?: be un+concern+ed
    McNamee, Paul, Nicholas, Charles K., and Mayfield, James
    In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2008, Singapore, July 20-24, 2008 2008

2007

  1. Building initial partitions through sampling techniques
    Volkovich, Vladimir, Kogan, Jacob, and Nicholas, Charles K.
    Eur. J. Oper. Res. 2007

2006

  1. Sampling Methods for Building Initial Partitions
    Volkovich, Zeev, Kogan, Jacob, and Nicholas, Charles K.
    2006
  2. Grouping Multidimensional Data - Recent Advances in Clustering
    2006

2005

  1. Data Driven Similarity Measures for k-Means Like Clustering Algorithms
    Kogan, Jacob, Teboulle, Marc, and Nicholas, Charles K.
    Inf. Retr. 2005

2004

  1. Finding aliases on the web using latent semantic analysis
    Bhat, Vinay, Oates, Tim, Shanbhag, Vishal, and Nicholas, Charles K.
    Data Knowl. Eng. 2004

2003

  1. Text mining with information-theoretic clustering
    Kogan, Jacob, Nicholas, Charles K., and Volkovich, Vladimir
    Comput. Sci. Eng. 2003
  2. UMBC at TREC 12
    Kallurkar, Srikanth, Shi, Yongmei, Cost, R. Scott, Nicholas, Charles K., Java, Akshay, James, Christopher, Rajavaram, Sowjanya, Shanbhag, Vishal, Bhatkar, Sachin, and Ogle, Drew
    In Proceedings of The Twelfth Text REtrieval Conference, TREC 2003, Gaithersburg, Maryland, USA, November 18-21, 2003 2003

2002

  1. ITtalks: A Case Study in the Semantic Web and DAML+OIL
    Cost, R. Scott, Finin, Timothy W., Joshi, Anupam, Peng, Yun, Nicholas, Charles K., Soboroff, Ian, Chen, Harry, Kagal, Lalana, Perich, Filip, Zou, Youyong, and Tolia, Sovrin
    IEEE Intell. Syst. 2002
  2. Related, but not Relevant: Content-Based Collaborative Filtering in TREC-8
    Soboroff, Ian, and Nicholas, Charles K.
    Inf. Retr. 2002
  3. Integrating Distributed Information Sources with CARROT II
    Cost, R. Scott, Kallurkar, Srikanth, Majithia, Hemali, Nicholas, Charles K., and Shi, Yongmei
    In Cooperative Information Agents VI, 6th International Workshop, CIA 2002, Madrid, Spain, September 18-20, 2002, Proceedings 2002
  4. CARROT II and the TREC 11 Web Track
    Cost, R. Scott, Kallurkar, Srikanth, Majithia, Hemali, Nicholas, Charles K., and Shi, Yongmei
    In Proceedings of The Eleventh Text REtrieval Conference, TREC 2002, Gaithersburg, Maryland, USA, November 19-22, 2002 2002
  5. Agents Making Sense of the Semantic Web
    Kagal, Lalana, Perich, Filip, Chen, Harry, Tolia, Sovrin, Zou, Youyong, Finin, Timothy W., Joshi, Anupam, Peng, Yun, Cost, R. Scott, and Nicholas, Charles K.
    In Innovative Concepts for Agent-Based Systems, First International Workshop on Radical Agent Concepts, WRAC 2002, McLean, VA, USA, January 16-18, 2002, Revised Papers 2002

2001

  1. Ranking Retrieval Systems without Relevance Judgments
    Soboroff, Ian, Nicholas, Charles, and Cahan, Patrick
    In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2001
  2. ITTALKS: An Application of Agents in the Semantic Web
    Perich, Filip, Kagal, Lalana, Chen, Harry, Tolia, Sovrin, Zou, Youyong, Finin, Timothy W., Joshi, Anupam, Peng, Yun, Cost, R. Scott, and Nicholas, Charles K.
    In Engineering Societies in the Agents World II, Second International Workshop, ESAW 2001, Prague, Czech Republic, July 7, 2001, Revised Papers 2001
  3. ITTALKS: A Case Study in the Semantic Web and DAML
    Cost, R. Scott, Finin, Timothy W., Joshi, Anupam, Peng, Yun, Nicholas, Charles K., Chen, Harry, Kagal, Lalana, Perich, Filip, Zou, Youyong, and Tolia, Sovrin
    In Proceedings of SWWS’01, The first Semantic Web Working Symposium, Stanford University, California, USA, July 30 - August 1, 2001 2001
  4. Ranking Retrieval Systems without Relevance Judgments
    Soboroff, Ian, Nicholas, Charles K., and Cahan, Patrick
    In SIGIR 2001: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, September 9-13, 2001, New Orleans, Louisiana, USA 2001
  5. Case Study: Visualization and Information Retrieval Techniques for Network Intrusion Detection
    Atkison, Travis, Pensy, Kathleen, Nicholas, Charles K., Ebert, David S., Atkison, Rebekah, and Morris, Chris
    In 3rd Joint Eurographics - IEEE TCVG Symposium on Visualization, VisSym 2001, Ascona, Switzerland, May 28-30, 2001 2001

2000

  1. Performance and Scalability of a Large-Scale N-gram Based Information Retrieval System
    Miller, Ethan, Shen, Dan, Liu, Junli, and Nicholas, Charles K.
    J. Digit. Inf. 2000
  2. Collaborative filtering and the generalized vector space model
    Soboroff, Ian, and Nicholas, Charles K.
    In SIGIR 2000: Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, July 24-28, 2000, Athens, Greece 2000

1999

  1. Interactive Volumetric Information Visualization for Document Corpus Management
    Shaw, Christopher D., Kukla, James M., Soboroff, Ian, Ebert, David S., Nicholas, Charles K., Zwa, Amen, Miller, Ethan L., and Roberts, D. Aaron
    Int. J. Digit. Libr. 1999
  2. Workshop on Recommender Systems: Algorithms and Evaluation
    Soboroff, Ian, Nicholas, Charles K., and Pazzani, Michael J.
    SIGIR Forum 1999
  3. Techniques for Gigabyte-Scale N-gram Based Information Retrieval on Personal Computers
    Miller, Ethan L., Shen, Dan, Liu, Junli, Nicholas, Charles K., and Chen, Ting
    In Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, PDPTA 1999, June 28 - July 1, 1999, Las Vegas, Nevada, USA 1999

1998

  1. Spotting Topics with the Singular Value Decomposition
    Nicholas, Charles K., and Dahlberg, Randall
    In Principles of Digital Document Processing, 4th International Workshop, PODDP’98, Saint Malo, France, March 29-30, 1998, Proceedings 1998

1997

  1. TKQML: A Scripting Tool for Building Agents
    Cost, R. Scott, Soboroff, Ian, Lakhani, Jeegar, Finin, Timothy W., Miller, Ethan L., and Nicholas, Charles K.
    In Intelligent Agents IV, Agent Theories, Architectures, and Languages, 4th International Workshop, ATAL ’97, Providence, Rhode Island, USA, July 24-26, 1997, Proceedings 1997
  2. Visualizing Document Authorship Using n-grams and Latent Semantic Indexing
    Soboroff, Ian, Nicholas, Charles K., Kukla, James M., and Ebert, David S.
    In Proceedings of the Workshop on New Paradigms in Information Visualization and Manipulation (NIPV ’97), in conjunction with CIKM ’97, November 10-14, 1997, Las Vegas, NV, USA 1997
  3. Agent Development Support for Tcl
    Cost, R. Scott, Soboroff, Ian, Lakhani, Jeegar, Finin, Tim, Miller, Ethan L., and Nicholas, Charles K.
    In Proceedings of the Fifth Annual Tcl/Tk Workshop 1997, Boston, Massachusetts, USA, July 14-17, 1997 1997

1996

  1. TELLTALE: Experiments in a Dynamic Hypertext Environment for Degraded and Multilingual Data
    Pearce, Claudia, and Nicholas, Charles K.
    J. Am. Soc. Inf. Sci. 1996

1995

  1. Reliability of WWW Name Servers
    Rowe, Kenneth E., and Nicholas, Charles K.
    Comput. Networks ISDN Syst. 1995

1993

  1. Canto: a Hypertext Data Model
    Nicholas, Charles K., and Rosenberg, Linda H.
    Electron. Publ. 1993
  2. Information and Knowledge Management: Guest Editors’ Introduction
    Nicholas, Charles K., and Yesha, Yelena
    Int. J. Cooperative Inf. Syst. 1993
  3. Snitch: Augmenting Hypertext Documents with a Semantic Net
    Mayfield, James, and Nicholas, Charles K.
    Int. J. Cooperative Inf. Syst. 1993
  4. Generating a Dynamic Hypertext Environment with n-gram Analysis
    Pearce, Claudia, and Nicholas, Charles K.
    In CIKM 93, Proceedings of the Second International Conference on Information and Knowledge Management, Washington, DC, USA, November 1-5, 1993 1993

1992

  1. On the Interchangeability of SGML and ODA
    Nicholas, Charles K., and Welsch, Lawrence A.
    Electron. Publ. 1992