I am a Ph.D. candidate at the Korea Advanced Institute of Science and Technology (KAIST), advised by Prof. Jinwoo Shin. During my Ph.D., I was fortunate to intern twice at Amazon Web Services (AWS) in Seattle, WA, in 2021 and 2022. I am also a recipient of the Qualcomm Innovation Fellowship Korea 2020 for two of my papers. Previously, I received a B.S. in Mathematics and Computer Science from KAIST in 2017.

I am broadly interested in discovering simple priors (if they exist) that would close the gap between neural networks and human perception. Many topics are related, particularly (but not limited to) robustness (or generalization) against distribution shifts, e.g., adversarial examples, natural corruptions, out-of-distribution inputs, and label shifts. Ultimately, my research aims to understand why neural networks behave so differently from our brains, and how our brains make such reliable yet efficient inferences.

Email: jongheonj (at) kaist dot ac dot kr

# News

• Jan 2023: Our paper “Guiding Energy-based Models via Contrastive Latent Variables” has been accepted at ICLR 2023 as a notable-top-25% paper (Spotlight presentation).
• Nov 2022: Our paper “Confidence-aware Training of Smoothed Classifiers for Certified Robustness” will be presented at AAAI 2023 as an Oral presentation.
• Sept 2022: Our paper “NOTE: Robust Continual Test-time Adaptation Against Temporal Correlation” will be presented at NeurIPS 2022.
• Sept 2022: I will re-join AWS AI (Seattle, WA) as a returning intern and work until December.
• Aug 2022: Three papers will be presented at ECCV Workshop 2022.
• July 2022: Our paper “SPot-the-Difference Self-Supervised Pre-training for Anomaly Detection and Segmentation” will be presented at ECCV 2022.

# Publications

(*: Equal contribution, C: Conference, W: Workshop, P: Preprint)

## 2023

### [C11] Guiding Energy-based Models via Contrastive Latent Variables

Hankook Lee, Jongheon Jeong, Sejun Park, Jinwoo Shin

International Conference on Learning Representations (ICLR; Spotlight presentation), 2023

tl;dr: A simple yet effective framework for improving EBMs via contrastive representation learning. 

• Also appeared at the NeurIPS Workshop on Self-Supervised Learning 2022 as an Oral presentation

#### Abstract

An energy-based model (EBM) is a popular generative framework that offers both explicit density and architectural flexibility, but training them is difficult since it is often unstable and time-consuming. In recent years, various training techniques have been developed, e.g., better divergence measures or stabilization in MCMC sampling, but there often exists a large gap between EBMs and other generative frameworks like GANs in terms of generation quality. In this paper, we propose a novel and effective framework for improving EBMs via contrastive representation learning (CRL). To be specific, we consider representations learned by contrastive methods as the true underlying latent variable. This contrastive latent variable could guide EBMs to understand the data structure better, so it can improve and accelerate EBM training significantly. To enable the joint training of EBM and CRL, we also design a new class of latent-variable EBMs for learning the joint density of data and the contrastive latent variable. Our experimental results demonstrate that our scheme achieves lower FID scores, compared to prior-art EBM methods (e.g., additionally using variational autoencoders or diffusion techniques), even with significantly faster and more memory-efficient training. We also show conditional and compositional generation abilities of our latent-variable EBMs as their additional benefits, even without explicit conditional training.

### [C10] Confidence-aware Training of Smoothed Classifiers for Certified Robustness

Jongheon Jeong*, Seojin Kim*, Jinwoo Shin

AAAI Conference on Artificial Intelligence (AAAI; Oral presentation), 2023

tl;dr: A more sensible training method for randomized smoothing by incorporating a sample-wise control of target robustness. 

• Also appeared at ECCV AROW Workshop 2022

#### Abstract

Any classifier can be "smoothed out" under Gaussian noise to build a new classifier that is provably robust to $\ell_2$-adversarial perturbations, viz., by averaging its predictions over the noise via randomized smoothing. In this paper, we propose a simple training method leveraging the fundamental trade-off between accuracy and (adversarial) robustness to obtain more robust smoothed classifiers, in particular, through a sample-wise control of robustness over the training samples. We make this control feasible by using "accuracy under Gaussian noise" as an easy-to-compute proxy of adversarial robustness for an input: specifically, we differentiate the training objective depending on this proxy to filter out samples that are unlikely to benefit from the worst-case (adversarial) objective. Our experiments show that the proposed method, despite its simplicity, consistently exhibits improved certified robustness upon state-of-the-art training methods. Somewhat surprisingly, we find these improvements persist even for other notions of robustness, e.g., to various types of common corruptions.
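
The "smoothing" step in the first sentence of the abstract can be illustrated with a small Monte Carlo sketch. This only shows prediction by majority vote under Gaussian noise, not the training method proposed in the paper; the function names, the toy base classifier, and the default values of `sigma` and `n_samples` are assumptions made for illustration.

```python
import numpy as np


def smoothed_predict(base_classifier, x, sigma=0.25, n_samples=1000,
                     num_classes=10, rng=None):
    """Majority vote of `base_classifier` over Gaussian perturbations of `x`.

    `base_classifier` maps a batch of inputs to an array of integer labels.
    Returns the most frequent label and the empirical class frequencies.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Draw n_samples noisy copies of x: x + N(0, sigma^2 I).
    noise = rng.normal(0.0, sigma, size=(n_samples,) + x.shape)
    labels = base_classifier(x[None, ...] + noise)
    counts = np.bincount(labels, minlength=num_classes)
    return int(np.argmax(counts)), counts / n_samples


def toy_classifier(batch):
    # Hypothetical base classifier: class 1 if the first coordinate is positive.
    return (batch[:, 0] > 0).astype(np.int64)


x = np.array([1.0, 0.0])
label, freqs = smoothed_predict(toy_classifier, x, num_classes=2,
                                rng=np.random.default_rng(0))
```

The empirical frequency of the top class plays the role of the "accuracy under Gaussian noise" proxy mentioned in the abstract when the true label is known.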

## 2022

### [C9] NOTE: Robust Continual Test-time Adaptation Against Temporal Correlation

Taesik Gong, Jongheon Jeong, Taewon Kim, Yewon Kim, Jinwoo Shin, Sung-Ju Lee

Neural Information Processing Systems (NeurIPS), 2022

tl;dr: The first test-time adaptation method concerning temporally correlated data. 

#### Abstract

Test-time adaptation (TTA) is an emerging paradigm that addresses distributional shifts between training and testing phases without additional data acquisition or labeling cost; only unlabeled test data streams are used for continual model adaptation. Previous TTA schemes assume that the test samples are independent and identically distributed (i.i.d.), even though they are often temporally correlated (non-i.i.d.) in application scenarios, e.g., autonomous driving. We discover that most existing TTA methods fail dramatically under such scenarios. Motivated by this, we present a new test-time adaptation scheme that is robust against non-i.i.d. test data streams. Our novelty is mainly two-fold: (a) Instance-Aware Batch Normalization (IABN) that corrects normalization for out-of-distribution samples, and (b) Prediction-balanced Reservoir Sampling (PBRS) that simulates i.i.d. data stream from non-i.i.d. stream in a class-balanced manner. Our evaluation with various datasets, including real-world non-i.i.d. streams, demonstrates that the proposed robust TTA not only outperforms state-of-the-art TTA algorithms in the non-i.i.d. setting, but also achieves comparable performance to those algorithms under the i.i.d. assumption.
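
For readers unfamiliar with reservoir sampling, the following is a minimal sketch of the standard algorithm (often called Algorithm R) that keeps a uniform random subset of a stream in fixed memory. It is not the PBRS method of the paper, which additionally balances the memory by predicted class; the function and parameter names are illustrative assumptions.

```python
import random


def reservoir_sample(stream, capacity, seed=None):
    """Keep a uniform random sample of at most `capacity` items from a stream."""
    rng = random.Random(seed)
    memory = []
    for n_seen, item in enumerate(stream, start=1):
        if len(memory) < capacity:
            # Fill the memory until it reaches capacity.
            memory.append(item)
        else:
            # Keep the new item with probability capacity / n_seen,
            # evicting a uniformly chosen stored item.
            j = rng.randrange(n_seen)
            if j < capacity:
                memory[j] = item
    return memory


memory = reservoir_sample(range(1000), capacity=64, seed=0)
```

Because every item ends up in memory with equal probability regardless of its position, the stored samples are far less temporally correlated than the most recent window of the stream.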

### [W6] Learning Robust Representations via Nuisance-extended Information Bottleneck

Jongheon Jeong, Sihyun Yu, Hankook Lee, Jinwoo Shin

ECCV Workshop on Out-of-distribution Generalization in Computer Vision (OOD-CV), 2022

tl;dr: We propose to extend the Information Bottleneck with a nuisance variable for out-of-distribution generalization.

#### Abstract

The information bottleneck (IB) principle is one of the natural approaches to obtain a succinct representation x -> z for a given downstream task x -> y: namely, it finds z that (a) maximizes the (task-relevant) mutual information I(z; y), while (b) minimizing I(x; z) to constrain the capacity of z for better generalization. In practical scenarios where the training data is limited, however, the IB objective may not be able to prevent z from co-adapting to so-called "shortcut" signals, i.e., features found only in the training data that are predictive yet compressible enough. They typically stem from biases in data acquisition and are less generalizable under new (but still semantically-aligned) environments. To bypass such a failure mode, we extend the standard framework of IB to also model the nuisance information with respect to z, namely z_n, so that (z, z_n) can reconstruct x: by minimizing I(z_n; y) as well as the IB objective here, z can now encode more diverse y-related signals in x, while disentangling the remaining information from z. Our experimental results show that the representation learned from our proposed training consistently improves various notions of robustness, e.g., novelty detection and corruption robustness, over the standard VIB training without relying on data augmentations.
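
The two goals (a) and (b) of the standard IB principle are commonly combined into a single Lagrangian objective. As a sketch, with $\beta > 0$ an assumed trade-off coefficient that the abstract itself does not name:

```latex
\max_{z} \; I(z; y) \;-\; \beta \, I(x; z), \qquad \beta > 0
```

A larger $\beta$ compresses $z$ more aggressively, while a smaller $\beta$ favors keeping task-relevant information. The nuisance-extended objective of the paper adds further terms for $z_n$ and is not reproduced here.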

### [W5] OpenCoS: Contrastive Semi-supervised Learning for Handling Open-set Unlabeled Data

Jongjin Park*, Sukmin Yun*, Jongheon Jeong, Jinwoo Shin

ECCV Workshop on Learning from Limited and Imperfect Data (L2ID), 2022

tl;dr: A contrastive learning based framework to enhance semi-SL methods to also utilize “out-of-class” unlabeled samples. 

#### Abstract

Semi-supervised learning (SSL) has been a powerful strategy to incorporate few labels in learning better representations. In this paper, we focus on a practical scenario that one aims to apply SSL when unlabeled data may contain out-of-class samples - those that cannot have one-hot encoded labels from a closed-set of classes in label data, i.e., the unlabeled data is an open-set. Specifically, we introduce OpenCoS, a simple framework for handling this realistic semi-supervised learning scenario based upon a recent framework of self-supervised visual representation learning. We first observe that the out-of-class samples in the open-set unlabeled dataset can be identified effectively via self-supervised contrastive learning. Then, OpenCoS utilizes this information to overcome the failure modes in the existing state-of-the-art semi-supervised methods, by utilizing one-hot pseudo-labels and soft-labels for the identified in- and out-of-class unlabeled data, respectively. Our extensive experimental results show the effectiveness of OpenCoS under the presence of out-of-class samples, fixing up the state-of-the-art semi-supervised methods to be suitable for diverse scenarios involving open-set unlabeled data.

#### BibTeX

@misc{park2021opencos,
title={Open{CoS}: Contrastive Semi-supervised Learning for Handling Open-set Unlabeled Data},
author={Jongjin Park and Sukmin Yun and Jongheon Jeong and Jinwoo Shin},
year={2021},
eprint={2107.08943},
archivePrefix={arXiv},
primaryClass={cs.CV}
}


### [C8] SPot-the-Difference Self-Supervised Pre-training for Anomaly Detection and Segmentation

Yang Zou, Jongheon Jeong, Latha Pemula, Dongqing Zhang, Onkar Dabeer

European Conference on Computer Vision (ECCV), 2022

tl;dr: (a) VisA - a new larger-scale benchmark for industrial anomaly detection and segmentation; (b) a novel self-supervised pre-training method targeting anomaly downstream tasks on the benchmark. 

#### Abstract

Visual anomaly detection is commonly used in industrial quality inspection. In this paper, we present a new dataset as well as a new self-supervised learning method for ImageNet pre-training to improve anomaly detection and segmentation in 1-class and 2-class 5/10/high-shot training setups. We release the Visual Anomaly (VisA) Dataset consisting of 10,821 high-resolution color images (9,621 normal and 1,200 anomalous samples) covering 12 objects in 3 domains, making it the largest industrial anomaly detection dataset to date. Both image and pixel-level labels are provided. We also propose a new self-supervised framework - SPot-the-difference (SPD) - which can regularize contrastive self-supervised pre-training, such as SimSiam, MoCo and SimCLR, to be more suitable for anomaly detection tasks. Our experiments on VisA and MVTec-AD dataset show that SPD consistently improves these contrastive pre-training baselines and even the supervised pre-training. For example, SPD improves Area Under the Precision-Recall curve (AU-PR) for anomaly segmentation by 5.9% and 6.8% over SimSiam and supervised pre-training respectively in the 2-class high-shot regime. We open-source the project at http://github.com/amazon-research/spot-diff.

#### BibTeX

@inproceedings{zou2022spot,
title={SPot-the-Difference Self-supervised Pre-training for Anomaly Detection and Segmentation},
author={Zou, Yang and Jeong, Jongheon and Pemula, Latha and Zhang, Dongqing and Dabeer, Onkar},
booktitle={European Conference on Computer Vision},
pages={392--408},
year={2022},
organization={Springer}
}


### [C7/W] Consistency Regularization for Adversarial Robustness

Jihoon Tack, Sihyun Yu, Jongheon Jeong, Minseon Kim, Sung Ju Hwang, Jinwoo Shin

AAAI Conference on Artificial Intelligence (AAAI), 2022

tl;dr: Consistency regularization can also prevent robustness overfitting in adversarial training. 

• Also appeared at ICML AdvML Workshop 2021 as an Oral presentation
• Won the Best Paper Award from Korean Artificial Intelligence Association 2021

#### Abstract

Adversarial training (AT) is currently one of the most successful methods to obtain the adversarial robustness of deep neural networks. However, the phenomenon of robust overfitting, i.e., the robustness starts to decrease significantly during AT, has been problematic, not only making practitioners consider a bag of tricks for a successful training, e.g., early stopping, but also incurring a significant generalization gap in the robustness. In this paper, we propose an effective regularization technique that prevents robust overfitting by optimizing an auxiliary "consistency" regularization loss during AT. Specifically, we discover that data augmentation is a quite effective tool to mitigate the overfitting in AT, and develop a regularization that forces the predictive distributions after attacking from two different augmentations of the same instance to be similar with each other. Our experimental results demonstrate that such a simple regularization technique brings significant improvements in the test robust accuracy of a wide range of AT methods. More remarkably, we also show that our method could significantly help the model to generalize its robustness against unseen adversaries, e.g., other types or larger perturbations compared to those used during training. Code is available at https://github.com/alinlab/consistency-adversarial.

#### BibTeX

@inproceedings{tack2022consistency,
title={Consistency Regularization for Adversarial Robustness},
author={Jihoon Tack and Sihyun Yu and Jongheon Jeong and Minseon Kim and Sung Ju Hwang and Jinwoo Shin},
booktitle={AAAI Conference on Artificial Intelligence},
year={2022}
}


## 2021

### [C6/W] SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness

Jongheon Jeong, Sejun Park, Minkyu Kim, Heung-Chang Lee, Doguk Kim, Jinwoo Shin

Neural Information Processing Systems (NeurIPS), 2021

tl;dr: Overconfident inputs nearby the data may cause adversarial vulnerability in randomized smoothing, and regularizing them toward the uniform confidence improves robustness. 

• Also appeared at ICML AdvML Workshop 2021

#### Abstract

Randomized smoothing is currently a state-of-the-art method to construct a certifiably robust classifier from neural networks against $\ell_2$-adversarial perturbations. Under the paradigm, the robustness of a classifier is aligned with the prediction confidence, i.e., the higher confidence from a smoothed classifier implies the better robustness. This motivates us to rethink the fundamental trade-off between accuracy and robustness in terms of calibrating confidences of a smoothed classifier. In this paper, we propose a simple training scheme, coined SmoothMix, to control the robustness of smoothed classifiers via self-mixup: it trains on convex combinations of samples along the direction of adversarial perturbation for each input. The proposed procedure effectively identifies over-confident, near off-class samples as a cause of limited robustness in case of smoothed classifiers, and offers an intuitive way to adaptively set a new decision boundary between these samples for better robustness. Our experimental results demonstrate that the proposed method can significantly improve the certified $\ell_2$-robustness of smoothed classifiers compared to existing state-of-the-art robust training methods.

#### BibTeX

@inproceedings{jeong2021smoothmix,
title={Smooth{Mix}: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness},
author={Jongheon Jeong and Sejun Park and Minkyu Kim and Heung-Chang Lee and Doguk Kim and Jinwoo Shin},
booktitle={Advances in Neural Information Processing Systems},
year={2021},
url={https://openreview.net/forum?id=nlEQMVBD359}
}


### [C5] Training GANs with Stronger Augmentations via Contrastive Discriminator

Jongheon Jeong, Jinwoo Shin

International Conference on Learning Representations (ICLR), 2021

tl;dr: We propose a novel discriminator of GAN showing that contrastive representation learning, e.g., SimCLR, and GAN can benefit each other when they are jointly trained. 

#### Abstract

Recent works in Generative Adversarial Networks (GANs) are actively revisiting various data augmentation techniques as an effective way to prevent discriminator overfitting. It is still unclear, however, which augmentations could actually improve GANs, and in particular, how to apply a wider range of augmentations in training. In this paper, we propose a novel way to address these questions by incorporating a recent contrastive representation learning scheme into the GAN discriminator, coined ContraD. This "fusion" enables the discriminators to work with much stronger augmentations without increasing their training instability, thereby preventing the discriminator overfitting issue in GANs more effectively. Even better, we observe that the contrastive learning itself also benefits from our GAN training, i.e., by maintaining discriminative features between real and fake samples, suggesting a strong coherence between the two worlds: good contrastive representations are also good for GAN discriminators, and vice versa. Our experimental results show that GANs with ContraD consistently improve FID and IS compared to other recent techniques incorporating data augmentations, still maintaining highly discriminative features in the discriminator in terms of the linear evaluation. Finally, as a byproduct, we also show that our GANs trained in an unsupervised manner (without labels) can induce many conditional generative models via a simple latent sampling, leveraging the learned features of ContraD. Code is available at https://github.com/jh-jeong/ContraD.

#### BibTeX

@inproceedings{jeong2021training,
title={Training {GAN}s with Stronger Augmentations via Contrastive Discriminator},
author={Jongheon Jeong and Jinwoo Shin},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=eo6U4CAwVmg}
}


## 2020

### [C4/W] Consistency Regularization for Certified Robustness of Smoothed Classifiers

Jongheon Jeong, Jinwoo Shin

Neural Information Processing Systems (NeurIPS), 2020

tl;dr: Consistency controls robustness in the world of randomized smoothing, like TRADES in adversarial training. 

• Also appeared at ICML UDL Workshop 2020
• Won Qualcomm Innovation Fellowship Korea 2020

#### Abstract

A recent technique of randomized smoothing has shown that the worst-case (adversarial) $\ell_2$-robustness can be transformed into the average-case Gaussian-robustness by "smoothing" a classifier, i.e., by considering the averaged prediction over Gaussian noise. In this paradigm, one should rethink the notion of adversarial robustness in terms of generalization ability of a classifier under noisy observations. We found that the trade-off between accuracy and certified robustness of smoothed classifiers can be greatly controlled by simply regularizing the prediction consistency over noise. This relationship allows us to design a robust training objective without approximating a non-existing smoothed classifier, e.g., via soft smoothing. Our experiments under various deep neural network architectures and datasets show that the "certified" $\ell_2$-robustness can be dramatically improved with the proposed regularization, even achieving better or comparable results to the state-of-the-art approaches with significantly less training costs and hyperparameters.
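
To make "regularizing the prediction consistency over noise" concrete, here is a minimal NumPy sketch of one plausible penalty: the average KL divergence from the mean prediction to each noisy prediction of the same input. The exact loss used in the paper may differ; the function name, the `eps` smoothing constant, and the example probabilities are assumptions for illustration.

```python
import numpy as np


def consistency_penalty(probs, eps=1e-12):
    """Average KL(mean prediction || each noisy prediction) for one input.

    `probs` has shape (m, K): class probabilities predicted for m
    Gaussian-perturbed copies of the same input over K classes.
    """
    probs = np.asarray(probs, dtype=np.float64)
    mean = probs.mean(axis=0)  # shape (K,)
    # KL(mean || probs[i]) for each noisy copy i, computed row-wise.
    kl = np.sum(mean * (np.log(mean + eps) - np.log(probs + eps)), axis=1)
    return float(kl.mean())


consistent = [[0.9, 0.1], [0.9, 0.1]]    # same prediction under both noises
inconsistent = [[0.9, 0.1], [0.1, 0.9]]  # prediction flips under noise
low = consistency_penalty(consistent)
high = consistency_penalty(inconsistent)
```

The penalty vanishes when all noisy copies receive the same prediction and grows as predictions disagree, so adding it to a standard classification loss pushes the classifier toward stable outputs under Gaussian noise.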

#### BibTeX

@inproceedings{jeong2020consistency,
author = {Jeong, Jongheon and Shin, Jinwoo},
booktitle = {Advances in Neural Information Processing Systems},
editor = {H. Larochelle and M. Ranzato and R. Hadsell and M.F. Balcan and H. Lin},
pages = {10558--10570},
publisher = {Curran Associates, Inc.},
title = {Consistency Regularization for Certified Robustness of Smoothed Classifiers},
url = {https://proceedings.neurips.cc/paper/2020/file/77330e1330ae2b086e5bfcae50d9ffae-Paper.pdf},
volume = {33},
year = {2020}
}


### [C3] CSI: Novelty Detection via Contrastive Learning on Distributionally Shifted Instances

Jihoon Tack*, Sangwoo Mo*, Jongheon Jeong, Jinwoo Shin

Neural Information Processing Systems (NeurIPS), 2020

tl;dr: Contrastive representations are surprisingly good at discriminating OOD samples, and contrasting also “OOD-like” augmentations can further improve their performances. 

• Won Qualcomm Innovation Fellowship Korea 2020

#### Abstract

Novelty detection, i.e., identifying whether a given sample is drawn from outside the training distribution, is essential for reliable machine learning. To this end, there have been many attempts at learning a representation well-suited for novelty detection and designing a score based on such representation. In this paper, we propose a simple, yet effective method named contrasting shifted instances (CSI), inspired by the recent success on contrastive learning of visual representations. Specifically, in addition to contrasting a given sample with other instances as in conventional contrastive learning methods, our training scheme contrasts the sample with distributionally-shifted augmentations of itself. Based on this, we propose a new detection score that is specific to the proposed training scheme. Our experiments demonstrate the superiority of our method under various novelty detection scenarios, including unlabeled one-class, unlabeled multi-class and labeled multi-class settings, with various image benchmark datasets. Code and pre-trained models are available at https://github.com/alinlab/CSI.

#### BibTeX

@inproceedings{tack2020csi,
author = {Tack, Jihoon and Mo, Sangwoo and Jeong, Jongheon and Shin, Jinwoo},
booktitle = {Advances in Neural Information Processing Systems},
editor = {H. Larochelle and M. Ranzato and R. Hadsell and M.F. Balcan and H. Lin},
pages = {11839--11852},
publisher = {Curran Associates, Inc.},
title = {CSI: Novelty Detection via Contrastive Learning on Distributionally Shifted Instances},
url = {https://proceedings.neurips.cc/paper/2020/file/8965f76632d7672e7d3cf29c87ecaa0c-Paper.pdf},
volume = {33},
year = {2020}
}


### [C2] M2m: Imbalanced Classification via Major-to-minor Translation

Jaehyung Kim*, Jongheon Jeong*, Jinwoo Shin

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020

tl;dr: Adversarial examples targeting majority -> minority can serve as surprisingly effective minority samples to prevent overfitting under class-imbalance.

#### Abstract

In most real-world scenarios, labeled training datasets are highly class-imbalanced, where deep neural networks suffer from generalizing to a balanced testing criterion. In this paper, we explore a novel yet simple way to alleviate this issue by augmenting less-frequent classes via translating samples (e.g., images) from more-frequent classes. This simple approach enables a classifier to learn more generalizable features of minority classes, by transferring and leveraging the diversity of the majority information. Our experimental results on a variety of class-imbalanced datasets show that the proposed method improves the generalization on minority classes significantly compared to other existing re-sampling or re-weighting methods. The performance of our method even surpasses those of previous state-of-the-art methods for the imbalanced classification.

#### BibTeX

@InProceedings{kim2020M2m,
author = {Kim, Jaehyung and Jeong, Jongheon and Shin, Jinwoo},
title = {M2m: Imbalanced Classification via Major-to-Minor Translation},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
month = {June},
year = {2020}
}


## 2019

### [C1] Training CNNs with Selective Allocation of Channels

Jongheon Jeong, Jinwoo Shin

International Conference on Machine Learning (ICML), 2019

tl;dr: Any CNN can become more efficient by “re-allocating” unnecessary channels to increase the kernel size.

#### Abstract

Recent progress in deep convolutional neural networks (CNNs) has enabled a simple paradigm of architecture design: larger models typically achieve better accuracy. Due to this, in modern CNN architectures, it becomes more important to design models that generalize well under certain resource constraints, e.g., the number of parameters. In this paper, we propose a simple way to improve the capacity of any CNN model having large-scale features, without adding more parameters. In particular, we modify a standard convolutional layer to have a new functionality of channel-selectivity, so that the layer is trained to select important channels to re-distribute their parameters. Our experimental results under various CNN architectures and datasets demonstrate that the proposed new convolutional layer allows new optima that generalize better via efficient resource utilization, compared to the baseline.

#### BibTeX

@InProceedings{jeong2020training,
title = 	 {Training {CNN}s with Selective Allocation of Channels},
author =       {Jeong, Jongheon and Shin, Jinwoo},
booktitle = 	 {Proceedings of the 36th International Conference on Machine Learning},
pages = 	 {3080--3090},
year = 	 {2019},
editor = 	 {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
volume = 	 {97},
series = 	 {Proceedings of Machine Learning Research},
month = 	 {09--15 Jun},
publisher =    {PMLR},
pdf = 	 {http://proceedings.mlr.press/v97/jeong19c/jeong19c.pdf},
url = 	 {https://proceedings.mlr.press/v97/jeong19c.html}
}

## 2016

### [W1] AutoML Challenge: AutoML Framework Using Random Space Partitioning Optimizer

Jungtaek Kim, Jongheon Jeong, Seungjin Choi

ICML Workshop on Automatic Machine Learning (AutoML), 2016

# Professional Services

• Conference reviewer
  • Neural Information Processing Systems (NeurIPS)
  • International Conference on Learning Representations (ICLR)
  • International Conference on Machine Learning (ICML)
  • AAAI Conference on Artificial Intelligence (AAAI)
  • IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• Journal reviewer
  • International Journal of Computer Vision (IJCV)
  • Transactions on Machine Learning Research (TMLR)
  • ACM Transactions on Modeling and Performance Evaluation of Computing Systems (ACM ToMPECS)