publications

Publications by category in reverse chronological order.

    2024

    1. Ransomware Evolution: Unveiling Patterns Using HDBSCAN
      Bhandary, Prajna, Joyce, Robert J, and Nicholas, Charles
      In 2024

    2023

    1. MalDICT: Benchmark Datasets on Malware Behaviors, Platforms, Exploitation, and Packers
      Joyce, Robert J., Raff, Edward, Nicholas, Charles, and Holt, James
      2023

    2022

    1. FedSPLIT: One-Shot Federated Recommendation System Based on Non-negative Joint Matrix Factorization and Knowledge Distillation
      Eren, Maksim E, Richards, Luke E, Bhattarai, Manish, Yus, Roberto, Nicholas, Charles, and Alexandrov, Boian S
      arXiv preprint arXiv:2205.02359 2022
    2. General-Purpose Unsupervised Cyber Anomaly Detection via Non-Negative Tensor Factorization
      Eren, Maksim Ekin, Moore, Juston, Skau, Erik, Bhattarai, Manish, Moore, Elisabeth, Chennupati, Gopinath, and Alexandrov, Boian
      Digital Threats: Research and Practice 2022

    2021

    1. MOTIF: A Large Malware Reference Dataset with Ground Truth Family Labels
      Joyce, Robert J., Amlani, Dev, Nicholas, Charles, and Raff, Edward
      2021
    2. A Framework for Cluster and Classifier Evaluation in the Absence of Reference Labels
      Joyce, Robert J., Raff, Edward, and Nicholas, Charles
      Proceedings of the 14th ACM Workshop on Artificial Intelligence and Security 2021
    3. Rank-1 Similarity Matrix Decomposition For Modeling Changes in Antivirus Consensus Through Time
      Joyce, Robert J., Raff, Edward, and Nicholas, Charles
      2021
    4. COVID-19 Multidimensional Kaggle Literature Organization
      Eren, Maksim Ekin, Solovyev, Nick, Hamer, Chris, McDonald, Renee, Alexandrov, Boian, and Nicholas, Charles
      In Proceedings of the ACM Symposium on Document Engineering 2021 2021
    5. Random Forest of Tensors (RFoT)
      Eren, Maksim Ekin, Nicholas, Charles, McDonald, Renee, and Hamer, Chris
      Presented at the 12th Annual Malware Technical Exchange Meeting, Online. 2021
    6. Evading Malware Classifiers via Monte Carlo Mutant Feature Discovery
      Boutsikas, John, Eren, Maksim Ekin, Varga, Charles, Raff, Edward, Matuszek, Cynthia, and Nicholas, Charles
      Presented at the 12th Annual Malware Technical Exchange Meeting, Online. 2021
    7. Bringing UMAP Closer to the Speed of Light with GPU Acceleration
      Nolet, Corey J., Lafargue, Victor, Raff, Edward, Nanditale, Thejaswi, Oates, Tim, Zedlewski, John, and Patterson, Joshua
      In The Thirty-Fifth AAAI Conference on Artificial Intelligence 2021
    8. Research Reproducibility as a Survival Analysis
      Raff, Edward
      In The Thirty-Fifth AAAI Conference on Artificial Intelligence 2021
    9. Classifying Sequences of Extreme Length with Constant Memory Applied to Malware Detection
      Raff, Edward, Fleshman, William, Zak, Richard, Anderson, Hyrum S., Filar, Bobby, and McLean, Mark
      In The Thirty-Fifth AAAI Conference on Artificial Intelligence 2021
    10. Accounting for Variance in Machine Learning Benchmarks
      Bouthillier, Xavier, Delaunay, Pierre, Bronzi, Mirko, Trofimov, Assya, Nichyporuk, Brennan, Szeto, Justin, Sepah, Naz, Raff, Edward, Madan, Kanika, Voleti, Vikram, Kahou, Samira Ebrahimi, Michalski, Vincent, Serdyuk, Dmitriy, Arbel, Tal, Pal, Chris, Varoquaux, Gaël, and Vincent, Pascal
      In Machine Learning and Systems (MLSys) 2021
    11. Exact Acceleration of K-Means++ and K-Means||
      Raff, Edward
      In 30th International Joint Conference on Artificial Intelligence (IJCAI-21) 2021
    12. Generating Thermal Human Faces for Physiological Assessment Using Thermal Sensor Auxiliary Labels
      Ordun, Catherine, Raff, Edward, and Purushotham, Sanjay
      In ICIP 2021
    13. Leveraging Uncertainty for Improved Static Malware Detection Under Extreme False Positive Constraints
      Nguyen, Andre T., Raff, Edward, Nicholas, Charles, and Holt, James
      In IJCAI-21 1st International Workshop on Adaptive Cyber Defense 2021

    2020

    1. Multi-Dimensional Anomalous Entity Detection via Poisson Tensor Factorization
      Eren, Maksim Ekin, Moore, Juston, and Alexandrov, Boian
      In 2020 IEEE International Conference on Intelligence and Security Informatics (ISI) 2020
    2. Flexible and Adaptive Fairness-aware Learning in Non-stationary Data Streams
      Zhang, Wenbin, Zhang, Mingli, Zhang, Ji, Liu, Zhen, Chen, Zhiyuan, Wang, Jianwu, Raff, Edward, and Messina, Enza
      In 2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI) 2020
    3. Sampling Approach Matters: Active Learning for Robotic Language Acquisition
      Pillai, Nisha, Raff, Edward, Ferraro, Francis, and Matuszek, Cynthia
      In 2020 IEEE International Conference on Big Data (Big Data) 2020
    4. COVID-19 Kaggle Literature Organization
      Eren, Maksim Ekin, Solovyev, Nick, Raff, Edward, Nicholas, Charles, and Johnson, Ben
      In Proceedings of the ACM Symposium on Document Engineering 2020 2020
    5. A Survey of Machine Learning Methods and Challenges for Windows Malware Classification
      In NeurIPS 2020 Workshop: ML Retrospectives, Surveys & Meta-Analyses (ML-RSA) 2020
    6. The Use of AI for Thermal Emotion Recognition: A Review of Problems and Limitations in Standard Design and Data
      Ordun, Catherine, Raff, Edward, and Purushotham, Sanjay
      In AAAI FSS-20: Artificial Intelligence in Government and Public Sector 2020
    7. Robust Design of Deep Neural Networks against Adversarial Attacks based on Lyapunov Theory
      Rahnama, Arash, Nguyen, Andre T., and Raff, Edward
      In The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020
    8. Automatic Yara Rule Generation Using Biclustering
      Raff, Edward, Zak, Richard, Munoz, Gary Lopez, Fleming, William, Anderson, Hyrum S., Filar, Bobby, Nicholas, Charles, and Holt, James
      In 13th ACM Workshop on Artificial Intelligence and Security (AISec’20) 2020
    9. A New Burrows Wheeler Transform Markov Distance
      Raff, Edward, Nicholas, Charles, and McLean, Mark
      In The Thirty-Fourth AAAI Conference on Artificial Intelligence 2020
    10. Exploratory Analysis of Covid-19 Tweets using Topic Modeling, UMAP, and DiGraphs
      Ordun, Catherine, Purushotham, Sanjay, and Raff, Edward
      In epiDAMIK 2020: 3rd epiDAMIK ACM SIGKDD International Workshop on Epidemiology meets Data Mining and Knowledge Discovery 2020
    11. Cluster Quality Analysis Using Silhouette Score
      Shahapure, Ketan Rajshekhar, and Nicholas, Charles
      In 7th IEEE International Conference on Data Science and Advanced Analytics, DSAA 2020, Sydney, Australia, October 6-9, 2020 2020
    12. A Quantum Algorithm To Locate Unknown Hashes For Known N-Grams Within A Large Malware Corpus
      Allgood, Nicholas R., and Nicholas, Charles K.
      CoRR 2020

    2019

    1. A Step Toward Quantifying Independently Reproducible Machine Learning Research
      Raff, Edward
      In Advances in Neural Information Processing Systems 2019
    2. PyLZJD: An Easy to Use Tool for Machine Learning
      Raff, Edward, Aurelio, Joe, and Nicholas, Charles
      In Proceedings of the 18th Python in Science Conference 2019
    3. KiloGrams: Very Large N-Grams for Malware Classification
      Raff, Edward, Fleming, William, Zak, Richard, Anderson, Hyrum, Finlayson, Bill, Nicholas, Charles K., and McLean, Mark
      In Proceedings of KDD 2019 Workshop on Learning and Mining for Cybersecurity (LEMINCS’19) 2019
    4. Barrage of random transforms for adversarially robust defense
      Raff, Edward, Sylvester, Jared, Forsyth, Steven, and McLean, Mark
      In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2019
    5. Non-Negative Networks Against Adversarial Attacks
      Fleshman, William, Raff, Edward, Sylvester, Jared, Forsyth, Steven, and McLean, Mark
      AAAI-2019 Workshop on Artificial Intelligence for Cyber Security 2019
    6. Would a File by Any Other Name Seem as Malicious?
      Nguyen, Andre T, Raff, Edward, and Sant-Miller, Aaron
      In 2019 IEEE International Conference on Big Data (Big Data) 2019

    2018

    1. Malware Detection by Eating a Whole EXE
      Raff, Edward, Barker, Jon, Sylvester, Jared, Brandon, Robert, Catanzaro, Bryan, and Nicholas, Charles
      In AAAI Workshop on Artificial Intelligence for Cyber Security 2018
    2. Static Malware Detection & Subterfuge: Quantifying the Robustness of Machine Learning and Current Anti-Virus
      Fleshman, William, Raff, Edward, Zak, Richard, McLean, Mark, and Nicholas, Charles
      In 2018 13th International Conference on Malicious and Unwanted Software (MALWARE) 2018
    3. Engineering a Simplified 0-Bit Consistent Weighted Sampling
      Raff, Edward, Sylvester, Jared, and Nicholas, Charles
      In Proceedings of the 27th ACM International Conference on Information and Knowledge Management 2018
    4. Gradient Reversal Against Discrimination: A Fair Neural Network Learning Approach
      Raff, Edward, and Sylvester, Jared
      In The 5th IEEE International Conference on Data Science and Advanced Analytics (DSAA) 2018
    5. Lempel-Ziv Jaccard Distance, an effective alternative to ssdeep and sdhash
      Digital Investigation 2018
    6. Hash-Grams: Faster N-Gram Features for Classification and Malware Detection
      In Proceedings of the ACM Symposium on Document Engineering 2018 2018

    2017

    1. Learning the PE Header, Malware Detection with Minimal Domain Knowledge
      Raff, Edward, Sylvester, Jared, and Nicholas, Charles
      In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security 2017
    2. What can N-grams learn for malware detection?
      Zak, Richard, Raff, Edward, and Nicholas, Charles
      In 2017 12th International Conference on Malicious and Unwanted Software (MALWARE) 2017
    3. An Alternative to NCD for Large Sequences, Lempel-Ziv Jaccard Distance
      In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’17 2017
    4. JSAT: Java Statistical Analysis Tool, a Library for Machine Learning
      Raff, Edward
      Journal of Machine Learning Research 2017
    5. Malware Classification and Class Imbalance via Stochastic Hashed LZJD
      In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security 2017
    6. Document Engineering Issues in Malware Analysis
      Nicholas, Charles K.
      In Proceedings of the 2017 ACM Symposium on Document Engineering, DocEng 2017, Valletta, Malta, September 4-7, 2017 2017

    2016

    1. An investigation of byte n-gram features for malware classification
      Raff, Edward, Zak, Richard, Cox, Russell, Sylvester, Jared, Yacci, Paul, Ward, Rebecca, Tracy, Anna, McLean, Mark, and Nicholas, Charles
      Journal of Computer Virology and Hacking Techniques 2016
    2. Document Engineering Issues in Malware Analysis
      Nicholas, Charles K., and Brandon, Robert
      In Proceedings of the 2016 ACM Symposium on Document Engineering, DocEng 2016, Vienna, Austria, September 13 - 16, 2016 2016

    2015

    1. Document Engineering Issues in Document Analysis
      Nicholas, Charles K., and Brandon, Robert
      In Proceedings of the 2015 ACM Symposium on Document Engineering, DocEng 2015, Lausanne, Switzerland, September 8-11, 2015 2015

    2013

    1. Document engineering education: workshop report
      Nicholas, Charles K., and Munson, Ethan V.
      SIGWEB Newsl. 2013
    2. Change-link 2.0: a digital forensic tool for visualizing changes to shadow volume data
      Leschke, Timothy R., and Nicholas, Charles K.
      In 10th Workshop on Visualization for Cyber Security, VizSec 2013, Atlanta, GA, USA, October 14, 2013 2013

    2009

    1. Translation Corpus Source and Size in Bilingual Retrieval
      McNamee, Paul, Mayfield, James, and Nicholas, Charles K.
      In Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, May 31 - June 5, 2009, Boulder, Colorado, USA, Short Papers 2009
    2. Addressing morphological variation in alphabetic languages
      McNamee, Paul, Nicholas, Charles K., and Mayfield, James
      In Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2009, Boston, MA, USA, July 19-23, 2009 2009

    2008

    1. Topological analysis of an online social network for older adults
      Wilson, Marcella, and Nicholas, Charles K.
      In Proceeding of the 2008 ACM Workshop on Search in Social Media, SSM 2008, Napa Valley, California, USA, October 30, 2008 2008
    2. Don’t have a stemmer?: be un+concern+ed
      McNamee, Paul, Nicholas, Charles K., and Mayfield, James
      In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2008, Singapore, July 20-24, 2008 2008

    2007

    1. Building initial partitions through sampling techniques
      Volkovich, Vladimir, Kogan, Jacob, and Nicholas, Charles K.
      Eur. J. Oper. Res. 2007

    2006

    1. Sampling Methods for Building Initial Partitions
      Volkovich, Zeev, Kogan, Jacob, and Nicholas, Charles K.
      2006
    2. Grouping Multidimensional Data - Recent Advances in Clustering
      2006

    2005

    1. Data Driven Similarity Measures for k-Means Like Clustering Algorithms
      Kogan, Jacob, Teboulle, Marc, and Nicholas, Charles K.
      Inf. Retr. 2005

    2004

    1. Finding aliases on the web using latent semantic analysis
      Bhat, Vinay, Oates, Tim, Shanbhag, Vishal, and Nicholas, Charles K.
      Data Knowl. Eng. 2004

    2003

    1. Text mining with information-theoretic clustering
      Kogan, Jacob, Nicholas, Charles K., and Volkovich, Vladimir
      Comput. Sci. Eng. 2003
    2. UMBC at TREC 12
      Kallurkar, Srikanth, Shi, Yongmei, Cost, R. Scott, Nicholas, Charles K., Java, Akshay, James, Christopher, Rajavaram, Sowjanya, Shanbhag, Vishal, Bhatkar, Sachin, and Ogle, Drew
      In Proceedings of The Twelfth Text REtrieval Conference, TREC 2003, Gaithersburg, Maryland, USA, November 18-21, 2003 2003

    2002

    1. ITtalks: A Case Study in the Semantic Web and DAML+OIL
      Cost, R. Scott, Finin, Timothy W., Joshi, Anupam, Peng, Yun, Nicholas, Charles K., Soboroff, Ian, Chen, Harry, Kagal, Lalana, Perich, Filip, Zou, Youyong, and Tolia, Sovrin
      IEEE Intell. Syst. 2002
    2. Related, but not Relevant: Content-Based Collaborative Filtering in TREC-8
      Soboroff, Ian, and Nicholas, Charles K.
      Inf. Retr. 2002
    3. Integrating Distributed Information Sources with CARROT II
      Cost, R. Scott, Kallurkar, Srikanth, Majithia, Hemali, Nicholas, Charles K., and Shi, Yongmei
      In Cooperative Information Agents VI, 6th International Workshop, CIA 2002, Madrid, Spain, September 18-20, 2002, Proceedings 2002
    4. CARROT II and the TREC 11 Web Track
      Cost, R. Scott, Kallurkar, Srikanth, Majithia, Hemali, Nicholas, Charles K., and Shi, Yongmei
      In Proceedings of The Eleventh Text REtrieval Conference, TREC 2002, Gaithersburg, Maryland, USA, November 19-22, 2002 2002
    5. Agents Making Sense of the Semantic Web
      Kagal, Lalana, Perich, Filip, Chen, Harry, Tolia, Sovrin, Zou, Youyong, Finin, Timothy W., Joshi, Anupam, Peng, Yun, Cost, R. Scott, and Nicholas, Charles K.
      In Innovative Concepts for Agent-Based Systems, First International Workshop on Radical Agent Concepts, WRAC 2002, McLean, VA, USA, January 16-18, 2002, Revised Papers 2002

    2001

    1. Ranking Retrieval Systems without Relevance Judgments
      Soboroff, Ian, Nicholas, Charles K., and Cahan, Patrick
      In SIGIR 2001: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, September 9-13, 2001, New Orleans, Louisiana, USA 2001
    2. ITTALKS: An Application of Agents in the Semantic Web
      Perich, Filip, Kagal, Lalana, Chen, Harry, Tolia, Sovrin, Zou, Youyong, Finin, Timothy W., Joshi, Anupam, Peng, Yun, Cost, R. Scott, and Nicholas, Charles K.
      In Engineering Societies in the Agents World II, Second International Workshop, ESAW 2001, Prague, Czech Republic, July 7, 2001, Revised Papers 2001
    3. ITTALKS: A Case Study in the Semantic Web and DAML
      Cost, R. Scott, Finin, Timothy W., Joshi, Anupam, Peng, Yun, Nicholas, Charles K., Chen, Harry, Kagal, Lalana, Perich, Filip, Zou, Youyong, and Tolia, Sovrin
      In Proceedings of SWWS’01, The first Semantic Web Working Symposium, Stanford University, California, USA, July 30 - August 1, 2001 2001
    4. Case Study: Visualization and Information Retrieval Techniques for Network Intrusion Detection
      Atkison, Travis, Pensy, Kathleen, Nicholas, Charles K., Ebert, David S., Atkison, Rebekah, and Morris, Chris
      In 3rd Joint Eurographics - IEEE TCVG Symposium on Visualization, VisSym 2001, Ascona, Switzerland, May 28-30, 2001 2001

    2000

    1. Performance and Scalability of a Large-Scale N-gram Based Information Retrieval System
      Millar, Ethan, Shen, Dan, Liu, Junli, and Nicholas, Charles K.
      J. Digit. Inf. 2000
    2. Collaborative filtering and the generalized vector space model
      Soboroff, Ian, and Nicholas, Charles K.
      In SIGIR 2000: Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, July 24-28, 2000, Athens, Greece 2000

    1999

    1. Interactive Volumetric Information Visualization for Document Corpus Management
      Shaw, Christopher D., Kukla, James M., Soboroff, Ian, Ebert, David S., Nicholas, Charles K., Zwa, Amen, Miller, Ethan L., and Roberts, D. Aaron
      Int. J. Digit. Libr. 1999
    2. Workshop on Recommender Systems: Algorithms and Evaluation
      Soboroff, Ian, Nicholas, Charles K., and Pazzani, Michael J.
      SIGIR Forum 1999
    3. Techniques for Gigabyte-Scale N-gram Based Information Retrieval on Personal Computers
      Miller, Ethan L., Shen, Dan, Liu, Junli, Nicholas, Charles K., and Chen, Ting
      In Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, PDPTA 1999, June 28 - July 1, 1999, Las Vegas, Nevada, USA 1999

    1998

    1. Spotting Topics with the Singular Value Decomposition
      Nicholas, Charles K., and Dahlberg, Randall
      In Principles of Digital Document Processing, 4th International Workshop, PODDP’98, Saint Malo, France, March 29-30, 1998, Proceedings 1998

    1997

    1. TKQML: A Scripting Tool for Building Agents
      Cost, R. Scott, Soboroff, Ian, Lakhani, Jeegar, Finin, Timothy W., Miller, Ethan L., and Nicholas, Charles K.
      In Intelligent Agents IV, Agent Theories, Architectures, and Languages, 4th International Workshop, ATAL ’97, Providence, Rhode Island, USA, July 24-26, 1997, Proceedings 1997
    2. Visualizing Document Authorship Using n-grams and Latent Semantic Indexing
      Soboroff, Ian, Nicholas, Charles K., Kukla, James M., and Ebert, David S.
      In Proceedings of the Workshop on New Paradigms in Information Visualization and Manipulation (NPIV ’97), in conjunction with CIKM ’97, November 10-14, 1997, Las Vegas, NV, USA 1997
    3. Agent Development Support for Tcl
      Cost, R. Scott, Soboroff, Ian, Lakhani, Jeegar, Finin, Tim, Miller, Ethan L., and Nicholas, Charles K.
      In Proceedings of the Fifth Annual Tcl/Tk Workshop 1997, Boston, Massachusetts, USA, July 14-17, 1997 1997

    1996

    1. TELLTALE: Experiments in a Dynamic Hypertext Environment for Degraded and Multilingual Data
      Pearce, Claudia, and Nicholas, Charles K.
      J. Am. Soc. Inf. Sci. 1996

    1995

    1. Reliability of WWW Name Servers
      Rowe, Kenneth E., and Nicholas, Charles K.
      Comput. Networks ISDN Syst. 1995

    1993

    1. Canto: a Hypertext Data Model
      Nicholas, Charles K., and Rosenberg, Linda H.
      Electron. Publ. 1993
    2. Information and Knowledge Management: Guest Editors’ Introduction
      Nicholas, Charles K., and Yesha, Yelena
      Int. J. Cooperative Inf. Syst. 1993
    3. Snitch: Augmenting Hypertext Documents with a Semantic Net
      Mayfield, James, and Nicholas, Charles K.
      Int. J. Cooperative Inf. Syst. 1993
    4. Generating a Dynamic Hypertext Environment with n-gram Analysis
      Pearce, Claudia, and Nicholas, Charles K.
      In CIKM 93, Proceedings of the Second International Conference on Information and Knowledge Management, Washington, DC, USA, November 1-5, 1993 1993

    1992

    1. On the Interchangeability of SGML and ODA
      Nicholas, Charles K., and Welsch, Lawrence A.
      Electron. Publ. 1992