I am a PhD student in the Algorithmic Intelligence Laboratory (ALIN Lab) at KAIST, advised by Prof. Jinwoo Shin. During my PhD studies, I interned at AWS AI twice (Seattle, WA; 2021 and 2022). I received a B.S. in Mathematics and Computer Science from KAIST in 2017. I am a recipient of the Qualcomm Innovation Fellowship Korea 2020 for two of my papers.

My research goal is to understand why neural networks behave so differently from our brain, and ultimately, how our brain makes inferences. Specifically, I am interested in discovering simple priors (if they exist) that would close the gap between neural networks and human perception. Many topics relate to this goal, particularly robustness (or generalization) to distribution shifts, e.g., adversarial, out-of-distribution, and label shifts, to name a few.

Email: jongheonj (at) kaist dot ac dot kr


:page_with_curl:  Publications

(*: Equal contribution, C: Conference, W: Workshop, P: Preprint)

2022

Consistency Regularization for Adversarial Robustness
AAAI Conference on Artificial Intelligence (AAAI), 2022
  Code   Slides   Poster

  • Also appeared at the ICML AdvML Workshop 2021 as an oral presentation
  • Won the Best Paper Award from the Korean Artificial Intelligence Association in 2021

Abstract

Adversarial training (AT) is currently one of the most successful methods to obtain the adversarial robustness of deep neural networks. However, the phenomenon of robust overfitting, i.e., the robustness starts to decrease significantly during AT, has been problematic, not only making practitioners consider a bag of tricks for a successful training, e.g., early stopping, but also incurring a significant generalization gap in the robustness. In this paper, we propose an effective regularization technique that prevents robust overfitting by optimizing an auxiliary `consistency' regularization loss during AT. Specifically, we discover that data augmentation is a quite effective tool to mitigate the overfitting in AT, and develop a regularization that forces the predictive distributions after attacking from two different augmentations of the same instance to be similar with each other. Our experimental results demonstrate that such a simple regularization technique brings significant improvements in the test robust accuracy of a wide range of AT methods. More remarkably, we also show that our method could significantly help the model to generalize its robustness against unseen adversaries, e.g., other types or larger perturbations compared to those used during training. Code is available at https://github.com/alinlab/consistency-adversarial.
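
The core regularizer can be summarized in a few lines. Below is a minimal PyTorch-style sketch of the idea as described above, not the implementation from the linked repository; the `augment` and `pgd_attack` helpers (and the temperature) are assumptions for illustration.

import torch.nn.functional as F

def consistency_loss(model, x, y, augment, pgd_attack, temperature=1.0):
    """Hypothetical sketch: make adversarial predictions on two augmented
    views of the same inputs agree with each other."""
    x1, x2 = augment(x), augment(x)            # two independent augmentations
    x1_adv = pgd_attack(model, x1, y)          # attack each view (e.g., with PGD)
    x2_adv = pgd_attack(model, x2, y)
    p1 = F.softmax(model(x1_adv) / temperature, dim=1)
    p2 = F.softmax(model(x2_adv) / temperature, dim=1)
    m = 0.5 * (p1 + p2)                        # mixture of the two predictions
    # Jensen-Shannon-style consistency term: both adversarial predictive
    # distributions are pulled toward their mixture.
    return 0.5 * (F.kl_div(m.log(), p1, reduction='batchmean')
                  + F.kl_div(m.log(), p2, reduction='batchmean'))

In training, such a term would be added with a weighting hyperparameter to a standard adversarial training objective (e.g., PGD-AT or TRADES).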

BibTeX

@inproceedings{tack2022consistency,
  title={Consistency Regularization for Adversarial Robustness},
  author={Jihoon Tack and Sihyun Yu and Jongheon Jeong and Minseon Kim and Sung Ju Hwang and Jinwoo Shin},
  booktitle={AAAI Conference on Artificial Intelligence},
  year={2022}
}

2021

SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness
Advances in Neural Information Processing Systems (NeurIPS), 2021
  Code   Talk   Slides   Poster

tl;dr: Overconfident inputs near the data may cause adversarial vulnerability in randomized smoothing, and regularizing them toward uniform confidence improves robustness.

  • Also appeared at ICML AdvML Workshop 2021

Abstract

Randomized smoothing is currently a state-of-the-art method to construct a certifiably robust classifier from neural networks against $\ell_2$-adversarial perturbations. Under the paradigm, the robustness of a classifier is aligned with the prediction confidence, i.e., the higher confidence from a smoothed classifier implies the better robustness. This motivates us to rethink the fundamental trade-off between accuracy and robustness in terms of calibrating confidences of a smoothed classifier. In this paper, we propose a simple training scheme, coined SmoothMix, to control the robustness of smoothed classifiers via self-mixup: it trains on convex combinations of samples along the direction of adversarial perturbation for each input. The proposed procedure effectively identifies over-confident, near off-class samples as a cause of limited robustness in case of smoothed classifiers, and offers an intuitive way to adaptively set a new decision boundary between these samples for better robustness. Our experimental results demonstrate that the proposed method can significantly improve the certified $\ell_2$-robustness of smoothed classifiers compared to existing state-of-the-art robust training methods.
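
As a rough illustration of the self-mixup step (my reading of the abstract, not the authors' exact procedure), the sketch below interpolates each input with an adversarial point `x_adv` of the smoothed classifier, obtained elsewhere, and softens the label toward uniform accordingly.

import torch
import torch.nn.functional as F

def smoothmix_batch(x, x_adv, y, num_classes, lam=0.5):
    """Hypothetical sketch: convex combination of an input with an adversarial
    point of the smoothed classifier, with a matching soft label."""
    x_mix = (1.0 - lam) * x + lam * x_adv
    y_onehot = F.one_hot(y, num_classes).float()
    uniform = torch.full_like(y_onehot, 1.0 / num_classes)
    # The closer the mixed point is to the adversarial endpoint, the less
    # confident the training target becomes.
    y_mix = (1.0 - lam) * y_onehot + lam * uniform
    return x_mix, y_mix

The mixed pair would then be trained with a soft-label cross-entropy, alongside the usual Gaussian-noise training used for randomized smoothing.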

BibTeX

@inproceedings{jeong2021smoothmix,
  title={Smooth{Mix}: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness},
  author={Jongheon Jeong and Sejun Park and Minkyu Kim and Heung-Chang Lee and Doguk Kim and Jinwoo Shin},
  booktitle={Advances in Neural Information Processing Systems},
  year={2021},
  url={https://openreview.net/forum?id=nlEQMVBD359}
}

OpenCoS: Contrastive Semi-supervised Learning for Handling Open-set Unlabeled Data
arXiv preprint arXiv:2107.08943, 2021
  Code


Abstract

Semi-supervised learning (SSL) is one of the most promising paradigms to circumvent the expensive labeling cost for building a high-performance model. Most existing SSL methods conventionally assume both labeled and unlabeled data are drawn from the same (class) distribution. However, unlabeled data may include out-of-class samples in practice; those that cannot have one-hot encoded labels from a closed-set of classes in label data, i.e. unlabeled data is an open-set. In this paper, we introduce OpenCoS, a method for handling this realistic semi-supervised learning scenario based upon a recent framework of self-supervised visual representation learning. Specifically, we first observe that the out-of-class samples in the open-set unlabeled dataset can be identified effectively via self-supervised contrastive learning. Then, OpenCoS utilizes this information to overcome the failure modes in the existing state-of-the-art semi-supervised methods, by utilizing one-hot pseudo-labels and soft-labels for the identified in- and out-of-class unlabeled data, respectively. Our extensive experimental results show the effectiveness of OpenCoS, fixing up the state-of-the-art semi-supervised methods to be suitable for diverse scenarios involving open-set unlabeled data.
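
A very loose sketch of the overall recipe, under my own simplifying assumptions (a SimCLR-style `encoder`, per-class `class_prototypes` computed from labeled data, and a similarity threshold `tau`); the paper's actual detection and labeling rules may differ.

import torch
import torch.nn.functional as F

@torch.no_grad()
def pseudo_label(encoder, classifier, x_unlabeled, class_prototypes, tau=0.5):
    """Hypothetical sketch: detect out-of-class unlabeled samples with
    contrastive features, then assign hard vs. soft pseudo-labels."""
    z = F.normalize(encoder(x_unlabeled), dim=1)     # (N, D) contrastive features
    protos = F.normalize(class_prototypes, dim=1)    # (C, D) labeled-class prototypes
    max_sim = (z @ protos.t()).max(dim=1).values     # similarity to the closest class
    in_class = max_sim >= tau                        # likely drawn from a known class
    probs = F.softmax(classifier(x_unlabeled), dim=1)
    hard = F.one_hot(probs.argmax(dim=1), probs.size(1)).float()
    # One-hot pseudo-labels for in-class samples, soft labels for the rest.
    return torch.where(in_class.unsqueeze(1), hard, probs), in_class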

BibTeX

@misc{park2021opencos,
  title={Open{CoS}: Contrastive Semi-supervised Learning for Handling Open-set Unlabeled Data},
  author={Jongjin Park and Sukmin Yun and Jongheon Jeong and Jinwoo Shin},
  year={2021},
  eprint={2107.08943},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Training GANs with Stronger Augmentations via Contrastive Discriminator
International Conference on Learning Representations (ICLR), 2021
  Code   Talk   Slides   Poster

tl;dr: We propose a novel GAN discriminator, showing that contrastive representation learning (e.g., SimCLR) and GAN training can benefit each other when trained jointly.


Abstract

Recent works in Generative Adversarial Networks (GANs) are actively revisiting various data augmentation techniques as an effective way to prevent discriminator overfitting. It is still unclear, however, that which augmentations could actually improve GANs, and in particular, how to apply a wider range of augmentations in training. In this paper, we propose a novel way to address these questions by incorporating a recent contrastive representation learning scheme into the GAN discriminator, coined ContraD. This "fusion" enables the discriminators to work with much stronger augmentations without increasing their training instability, thereby preventing the discriminator overfitting issue in GANs more effectively. Even better, we observe that the contrastive learning itself also benefits from our GAN training, i.e., by maintaining discriminative features between real and fake samples, suggesting a strong coherence between the two worlds: good contrastive representations are also good for GAN discriminators, and vice versa. Our experimental results show that GANs with ContraD consistently improve FID and IS compared to other recent techniques incorporating data augmentations, still maintaining highly discriminative features in the discriminator in terms of the linear evaluation. Finally, as a byproduct, we also show that our GANs trained in an unsupervised manner (without labels) can induce many conditional generative models via a simple latent sampling, leveraging the learned features of ContraD. Code is available at https://github.com/jh-jeong/ContraD.
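
The architectural idea can be sketched as follows (an illustrative simplification, not the released implementation): the discriminator shares a contrastively trained backbone, and only a small head receives gradients from the GAN loss, so the backbone can be trained with much stronger augmentations via the contrastive objectives.

import torch.nn as nn

class ContrastiveDiscriminator(nn.Module):
    """Hypothetical sketch of a discriminator built on contrastive features."""
    def __init__(self, encoder, feat_dim, proj_dim=128):
        super().__init__()
        self.encoder = encoder                                       # e.g., a SimCLR backbone
        self.proj = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(),
                                  nn.Linear(feat_dim, proj_dim))     # contrastive projection head
        self.disc_head = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(),
                                       nn.Linear(feat_dim, 1))       # real/fake head

    def project(self, x):
        # Used by the contrastive (SimCLR-style) losses that train the backbone.
        return self.proj(self.encoder(x))

    def forward(self, x):
        # The GAN loss only updates the small head; the backbone is detached here.
        return self.disc_head(self.encoder(x).detach())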

BibTeX

@inproceedings{jeong2021training,
  title={Training {GAN}s with Stronger Augmentations via Contrastive Discriminator},
  author={Jongheon Jeong and Jinwoo Shin},
  booktitle={International Conference on Learning Representations},
  year={2021},
  url={https://openreview.net/forum?id=eo6U4CAwVmg}
}

2020

Consistency Regularization for Certified Robustness of Smoothed Classifiers
Advances in Neural Information Processing Systems (NeurIPS), 2020
  Code   Talk   Slides   Poster

tl;dr: Consistency controls robustness in the world of randomized smoothing, much like TRADES does in adversarial training.

  • Also appeared at ICML UDL Workshop 2020
  • Won Qualcomm Innovation Fellowship Korea 2020

Abstract

A recent technique of randomized smoothing has shown that the worst-case (adversarial) $\ell_2$-robustness can be transformed into the average-case Gaussian-robustness by "smoothing" a classifier, i.e., by considering the averaged prediction over Gaussian noise. In this paradigm, one should rethink the notion of adversarial robustness in terms of generalization ability of a classifier under noisy observations. We found that the trade-off between accuracy and certified robustness of smoothed classifiers can be greatly controlled by simply regularizing the prediction consistency over noise. This relationship allows us to design a robust training objective without approximating a non-existing smoothed classifier, e.g., via soft smoothing. Our experiments under various deep neural network architectures and datasets show that the "certified" $\ell_2$-robustness can be dramatically improved with the proposed regularization, even achieving better or comparable results to the state-of-the-art approaches with significantly less training costs and hyperparameters.
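
A minimal sketch of the regularizer, with the noise level `sigma` and the number of noisy copies as assumed hyperparameters; the paper's full objective also includes an entropy term and a specific KL direction, which this simplification glosses over.

import torch
import torch.nn.functional as F

def consistency_over_noise(model, x, sigma=0.25, num_noises=2):
    """Hypothetical sketch: penalize disagreement among predictions on
    independently Gaussian-perturbed copies of the same input."""
    probs = [F.softmax(model(x + sigma * torch.randn_like(x)), dim=1)
             for _ in range(num_noises)]
    mean_p = torch.stack(probs).mean(dim=0).clamp_min(1e-12)
    # Pull every noisy prediction toward the average prediction.
    loss = sum(F.kl_div(mean_p.log(), p, reduction='batchmean') for p in probs)
    return loss / num_noises

This term would be added, with a weight, to the standard cross-entropy computed on the same noisy copies used for Gaussian data augmentation.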

BibTeX

@inproceedings{jeong2020consistency,
 author = {Jeong, Jongheon and Shin, Jinwoo},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {H. Larochelle and M. Ranzato and R. Hadsell and M.F. Balcan and H. Lin},
 pages = {10558--10570},
 publisher = {Curran Associates, Inc.},
 title = {Consistency Regularization for Certified Robustness of Smoothed Classifiers},
 url = {https://proceedings.neurips.cc/paper/2020/file/77330e1330ae2b086e5bfcae50d9ffae-Paper.pdf},
 volume = {33},
 year = {2020}
}

CSI: Novelty Detection via Contrastive Learning on Distributionally Shifted Instances
Advances in Neural Information Processing Systems (NeurIPS), 2020
  Code   Talk   Slides   Poster

tl;dr: Contrastive representations are surprisingly good at discriminating OOD samples, and additionally contrasting “OOD-like” augmentations can further improve their performance.

  • Won Qualcomm Innovation Fellowship Korea 2020

Abstract

Novelty detection, i.e., identifying whether a given sample is drawn from outside the training distribution, is essential for reliable machine learning. To this end, there have been many attempts at learning a representation well-suited for novelty detection and designing a score based on such representation. In this paper, we propose a simple, yet effective method named contrasting shifted instances (CSI), inspired by the recent success on contrastive learning of visual representations. Specifically, in addition to contrasting a given sample with other instances as in conventional contrastive learning methods, our training scheme contrasts the sample with distributionally-shifted augmentations of itself. Based on this, we propose a new detection score that is specific to the proposed training scheme. Our experiments demonstrate the superiority of our method under various novelty detection scenarios, including unlabeled one-class, unlabeled multi-class and labeled multi-class settings, with various image benchmark datasets. Code and pre-trained models are available at https://github.com/alinlab/CSI.
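
The key twist over standard contrastive learning is how the training batch is built. Below is a rough sketch using rotations as the distributional shift (one choice of shifting transformation); the contrastive loss itself and the auxiliary shift-prediction head are omitted.

import torch

def rotate(x, k):
    # Rotate a batch of images of shape (N, C, H, W) by k * 90 degrees.
    return torch.rot90(x, k, dims=(2, 3))

def build_csi_batch(x):
    """Hypothetical sketch: shifted augmentations become separate instances to
    contrast against, rather than positives of the original sample."""
    shifted = torch.cat([rotate(x, k) for k in range(4)], dim=0)
    shift_labels = torch.cat([torch.full((x.size(0),), k, dtype=torch.long)
                              for k in range(4)])
    return shifted, shift_labels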

BibTeX

@inproceedings{tack2020csi,
 author = {Tack, Jihoon and Mo, Sangwoo and Jeong, Jongheon and Shin, Jinwoo},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {H. Larochelle and M. Ranzato and R. Hadsell and M.F. Balcan and H. Lin},
 pages = {11839--11852},
 publisher = {Curran Associates, Inc.},
 title = {CSI: Novelty Detection via Contrastive Learning on Distributionally Shifted Instances},
 url = {https://proceedings.neurips.cc/paper/2020/file/8965f76632d7672e7d3cf29c87ecaa0c-Paper.pdf},
 volume = {33},
 year = {2020}
}

M2m: Imbalanced Classification via Major-to-Minor Translation
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020
  Code   Talk   Slides

tl;dr: Adversarial examples targeting majority → minority classes can serve as surprisingly effective minority samples to prevent overfitting under class imbalance.


Abstract

In most real-world scenarios, labeled training datasets are highly class-imbalanced, where deep neural networks suffer from generalizing to a balanced testing criterion. In this paper, we explore a novel yet simple way to alleviate this issue by augmenting less-frequent classes via translating samples (e.g., images) from more-frequent classes. This simple approach enables a classifier to learn more generalizable features of minority classes, by transferring and leveraging the diversity of the majority information. Our experimental results on a variety of class-imbalanced datasets show that the proposed method improves the generalization on minority classes significantly compared to other existing re-sampling or re-weighting methods. The performance of our method even surpasses those of previous state-of-the-art methods for the imbalanced classification.
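
The translation step can be sketched roughly as a targeted, adversarial-style optimization (a simplification; `g` denotes an auxiliary classifier pretrained on the imbalanced data, and the paper's acceptance/rejection criteria are omitted here).

import torch
import torch.nn.functional as F

def translate_to_minority(g, x_major, target_class, step_size=0.1, steps=10):
    """Hypothetical sketch: perturb majority-class images until the auxiliary
    classifier g predicts the target minority class."""
    x = x_major.clone().detach().requires_grad_(True)
    target = torch.full((x_major.size(0),), target_class,
                        dtype=torch.long, device=x_major.device)
    for _ in range(steps):
        loss = F.cross_entropy(g(x), target)
        grad, = torch.autograd.grad(loss, x)
        # Targeted step: move toward the minority class (decrease the loss).
        x = (x - step_size * grad.sign()).clamp(0, 1).detach().requires_grad_(True)
    return x.detach()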

BibTeX

@InProceedings{kim2020M2m,
  author = {Kim, Jaehyung and Jeong, Jongheon and Shin, Jinwoo},
  title = {M2m: Imbalanced Classification via Major-to-Minor Translation},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  month = {June},
  year = {2020}
}

2019

Training CNNs with Selective Allocation of Channels
International Conference on Machine Learning (ICML), 2019
  Code   Talk   Slides   Poster

tl;dr: Any CNN can become more efficient by “re-allocating” unnecessary channels to increase the kernel size.


Abstract

Recent progress in deep convolutional neural networks (CNNs) have enabled a simple paradigm of architecture design: larger models typically achieve better accuracy. Due to this, in modern CNN architectures, it becomes more important to design models that generalize well under certain resource constraints, e.g. the number of parameters. In this paper, we propose a simple way to improve the capacity of any CNN model having large-scale features, without adding more parameters. In particular, we modify a standard convolutional layer to have a new functionality of channel-selectivity, so that the layer is trained to select important channels to re-distribute their parameters. Our experimental results under various CNN architectures and datasets demonstrate that the proposed new convolutional layer allows new optima that generalize better via efficient resource utilization, compared to the baseline.
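
The full layer is more involved than a few lines, but the channel-selectivity part can be caricatured as a learnable per-channel gate on a standard convolution (only the selection is shown; the actual re-distribution of the de-allocated parameters, e.g., into larger kernels, is not).

import torch
import torch.nn as nn

class SelectiveConv2d(nn.Module):
    """Hypothetical sketch: a convolution whose output channels are weighted by
    learnable importance scores, so unimportant channels can be identified and
    their parameters re-allocated during training."""
    def __init__(self, in_channels, out_channels, kernel_size, **kwargs):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, **kwargs)
        self.gate = nn.Parameter(torch.ones(out_channels))  # per-channel importance

    def forward(self, x):
        return self.conv(x) * torch.sigmoid(self.gate).view(1, -1, 1, 1)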

BibTeX

@InProceedings{jeong2020training,
  title = 	 {Training {CNN}s with Selective Allocation of Channels},
  author =       {Jeong, Jongheon and Shin, Jinwoo},
  booktitle = 	 {Proceedings of the 36th International Conference on Machine Learning},
  pages = 	 {3080--3090},
  year = 	 {2019},
  editor = 	 {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume = 	 {97},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {09--15 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v97/jeong19c/jeong19c.pdf},
  url = 	 {https://proceedings.mlr.press/v97/jeong19c.html}
}

2016


:briefcase:  Work Experience

  • Intern, AWS AI — Seattle, WA (2021 and 2022)

:medal_sports:  Honors & Awards

  • Qualcomm Innovation Fellowship Korea, 2020
  • Best Paper Award, Korean Artificial Intelligence Association, 2021

:handshake:  Professional Services

  • Conference reviewers
    • Neural Information Processing Systems (NeurIPS): 2020 (Top 10%) / 2021 / 2022
    • International Conference on Learning Representations (ICLR): 2020 / 2021 / 2022
    • International Conference on Machine Learning (ICML): 2021 (Expert reviewer) / 2022
    • AAAI Conference on Artificial Intelligence (AAAI): 2021 / 2022
  • Journal reviewers
    • International Journal of Computer Vision (IJCV)
    • Transactions on Machine Learning Research (TMLR)
    • ACM Transactions on Modeling and Performance Evaluation of Computing Systems (ACM ToMPECS)