
CS231n Lecture 2: Image Classification Pipeline

2024. 3. 7. 13:52

 

Image Classification Problem

 

A core task in Computer Vision

๊ณ ์–‘์ด๋‚˜ ๊ฐ•์•„์ง€ ํŠธ๋Ÿญ๊ณผ ๊ฐ™์€ ์ด๋ฏธ์ง€๋ฅผ ์˜ฌ๋ฐ”๋ฅด๊ฒŒ ๋ถ„๋ฅ˜ํ•˜๋Š” ๋ฌธ์ œ๋Š” ์‚ฌ๋žŒ์—๊ฒŒ๋Š” ์‰ฝ์ง€๋งŒ ์ปดํ“จํ„ฐ์—๊ฒŒ๋Š” ์–ด๋ ค์šด ๋ฌธ์ œ์ด๋‹ค.

 

์šฐ๋ฆฌ๊ฐ€ ์ด๋ฏธ์ง€๋ฅผ ๋ฐ”๋ผ๋ณด๋Š” ๋ฐฉ์‹๊ณผ ์ปดํ“จํ„ฐ๊ฐ€ ๋ฐ”๋ผ๋ณด๋Š” ๋ฐฉ์‹์—๋Š” ์ฐจ์ด๊ฐ€ ์žˆ๋‹ค. ์ปดํ“จํ„ฐ๋Š” ์ด๋ฏธ์ง€๋ฅผ ํ”ฝ์…€์ด๋ผ๋Š” ๋‹จ์œ„๋กœ ์ฝ๊ฒŒ ๋œ๋‹ค. ์šฐ๋ฆฌ์˜ ์ด๋ฏธ์ง€๊ฐ€ 800 x 600 ์˜ x 3 (3 : channels RGB) ํฌ๊ธฐ๋ฅผ ๊ฐ€์ง„๋‹ค๊ณ  ํ•˜๋ฉด ์ปดํ“จํ„ฐ๋Š” 800 x 600 x 3 ๊ฐœ์˜ ์ˆซ์ž ์ •๋ณด๋กœ ์ด๋ฏธ์ง€๋ฅผ ์ฝ๋Š” ๊ฒƒ์ด๋‹ค. 

 

Semantic Gap

๊ณ ์–‘์ด ์‚ฌ์ง„์„ ์˜ˆ๋กœ ๋“ค์ž๋ฉด, ๊ณ ์–‘์ด๋Š” ์šฐ๋ฆฌ๊ฐ€ ์ด๋ฏธ์ง€์— ๋ถ€์—ฌํ•œ semantic label ์ด๋‹ค. ๊ณ ์–‘์ด๋ผ๋Š” semantic idea์™€ pixel ๊ฐ’ (์ด๋ฏธ์ง€ ๋ฐฐ์—ด) ์‚ฌ์ด์—๋Š” ํฐ gap์ด ์žˆ๋‹ค.

 

Challenges

  • Viewpoint variation
    Even for the same cat, a slight change in camera angle makes the computer read a completely different array of numbers.
  • Illumination
    Depending on the lighting, the cat can appear very bright or very dark.
  • Deformation
    Cats are liquid... a cat can take on many different shapes, but it is still the same cat.
  • Occlusion
    Even when a cat is partially hidden, a person recognizes it right away, but it is hard for a computer to recognize an occluded cat.
  • Background Clutter
    When the cat looks similar to the background, it can be hard to pick out.
  • Intraclass variation
    Even within the same class, cats differ in age, size, color, and so on, so many different appearances exist.

 

An image classifier

๊ณ ์–‘์ด๋ฅผ ์™„๋ฒฝํ•˜๊ฒŒ ๋ถ„๋ฅ˜ํ•˜๊ธฐ ์œ„ํ•œ ์ฝ”๋“œ๋ฅผ ์ž‘์„ฑํ•˜๋Š” ๊ฒƒ์ด ๋ถˆ๊ฐ€๋Šฅํ•  ๋ฟ๋งŒ ์•„๋‹ˆ๋ผ ์„ธ์ƒ์— ์žˆ๋Š” ๋ชจ๋“  ๊ฐ์ฒด๋ฅผ ๋ถ„๋ฅ˜ํ•  ์ˆ˜ ์žˆ๋Š” ์ฝ”๋“œ๋ฅผ ์ž‘์„ฑํ•˜๋Š” ๊ฒƒ ๋˜ํ•œ ๋ถˆ๊ฐ€๋Šฅํ•˜๋‹ค. ๊ทธ๋ ‡๊ธฐ์— ์šฐ๋ฆฌ๋Š” ๋ฌด์–ธ๊ฐ€ ๋‹ค๋ฅธ ์ ‘๊ทผ๋ฐฉ๋ฒ•์ด ํ•„์š”ํ•˜๋‹ค.

 

Data-Driven Approach

์šฐ๋ฆฌ๋Š” ๊ฐ์ฒด๋ฅผ ๋ถ„๋ฅ˜ํ•˜๊ธฐ ์œ„ํ•ด craft code ํ•˜๋Š” ๊ฒƒ์ด ์•„๋‹ˆ๋ผ ์•„๋ž˜์˜ ๋ฐฉ๋ฒ•์„ ์ด์šฉํ•ด์„œ ๊ฐ์ฒด๋ฅผ ๋ถ„๋ฅ˜ํ•  ๊ฒƒ์ด๋‹ค.

  1. Collect a dataset of images and labels
    Gather as many images as possible, across many categories, from the web.
  2. Use Machine Learning to train a classifier
    Train a machine learning model on the collected data.
  3. Evaluate the classifier on new images
    Use the trained model to classify and evaluate new images.

 

 

From now on, rather than a single function that takes an image as input and returns a classification result,

we will use two functions: train and predict.

 

The train function takes images and labels as input and outputs a model,

and the predict function feeds an input to the model to predict the classification result for the image.

 

 

 

 

 

Classifier

 

Nearest Neighbor

The Nearest Neighbor algorithm is a somewhat silly method, but it is the simplest possible classifier.

 

def train(images, labels):
    # Machine learning!
    # Memorize all data and labels
    return model

def predict(model, test_images):
    # Use model to predict labels
    # Predict the label of the most similar training image
    return test_labels

 

During training we memorize all of the training data, and at prediction time we take a new image as input and find the most similar image in the training set.

 

Distance Metric

๊ทธ๋Ÿฐ๋ฐ Nearest Neighbor algorithm์—์„œ๋Š” ์ด๋ฏธ์ง€๋ฅผ ์–ด๋–ป๊ฒŒ ๋น„๊ตํ• ๊นŒ?

์šฐ๋ฆฌ๋Š” L1 distance (Manhattan) ๋ผ๋Š” ๊ฐ„๋‹จํ•œ ๋ฐฉ๋ฒ•์„ ์ด์šฉํ•˜์—ฌ ์ด๋ฏธ์ง€๋ฅผ ๋น„๊ตํ•œ๋‹ค.

L1 Distance์—์„œ๋Š” ์ด๋ฏธ์ง€์˜ ๊ฐ pixels์„ ๋น„๊ตํ•˜๋Š”๋ฐ, ์œ„์˜ ์‚ฌ์ง„์—์„œ ๋ณด๋ฉด test image์—์„œ training image์˜ ๊ฐ pixel ๊ฐ’์„ ๋‹จ์ˆœํ•˜๊ฒŒ ๋บ„์…ˆํ•œ๋‹ค. ์ดํ›„ ๋บ„์…ˆ์œผ๋กœ ๋งŒ๋“ค์–ด์ง„ ๋ชจ๋“  ๊ฐ’์„ ๋”ํ•ด์„œ ๋น„๊ตํ•œ๋‹ค.
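As a quick sanity check, the L1 comparison just described takes only a couple of NumPy lines; the pixel values below are made up for illustration.

```python
import numpy as np

# Two made-up 2x2 "images" standing in for a test and a training image.
test_image = np.array([[56, 32],
                       [10, 18]])
train_image = np.array([[10, 20],
                        [24, 17]])

# L1 (Manhattan) distance: sum of absolute pixel-wise differences.
l1 = np.sum(np.abs(test_image - train_image))
print(l1)  # |56-10| + |32-20| + |10-24| + |18-17| = 73
```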

 

import numpy as np

class NearestNeighbor:
    def __init__(self):
        pass

    def train(self, X, y):
        """
        X is N x D where each row is an example.
        y is 1-dimensional of size N
        """
        # the nearest neighbor classifier simply remembers all the training data
        self.Xtr = X
        self.ytr = y

    def predict(self, X):
        """
        X is N x D where each row is an example we wish to predict the label for
        """
        num_test = X.shape[0]

        # let's make sure that the output type matches the input type
        Ypred = np.zeros(num_test, dtype=self.ytr.dtype)

        # loop over all test rows
        for i in range(num_test):
            # find the nearest training image to the i'th test image
            # using the L1 distance (sum of absolute value differences)
            distances = np.sum(np.abs(self.Xtr - X[i, :]), axis=1)
            min_index = np.argmin(distances)  # get the index with the smallest distance
            Ypred[i] = self.ytr[min_index]

        return Ypred

 

 

Q: With N examples, how fast are training and prediction?

A: Train O(1), predict O(N). This is backwards: we can tolerate slow training, but we want prediction to be fast, which is exactly what Nearest Neighbor gets wrong.

 

 

K-Nearest Neighbors

Nearest Neighbors์—์„œ๋Š” ๊ฐ€์žฅ ๊ทผ์ ‘ํ•œ ํ•˜๋‚˜์˜ ์ด์›ƒ๋งŒ์„ ๋ณด๊ธฐ์— ๋ช‡ ๊ฐ€์ง€ ๋ฌธ์ œ์ ์ด ์กด์žฌํ•œ๋‹ค. ์ด๋Ÿฌํ•œ ๋ฌธ์ œ๋ฅผ ๋ณด์™„ํ•˜๊ธฐ ์œ„ํ•ด ์šฐ๋ฆฌ๋Š” ๋‹จ์ˆœํžˆ ํ•˜๋‚˜์˜ ๊ทผ์ ‘ํ•œ ์ด์›ƒ๋งŒ ๋ณด๋Š” ๊ฒƒ์ด ์•„๋‹ˆ๋ผ K๊ฐœ์˜ ๊ฐ€๊นŒ์šด ์ด์›ƒ์„ ์ฐพ์•„์„œ ๋‹ค์ˆ˜๊ฒฐ ํˆฌํ‘œ๋กœ label์„ ์˜ˆ์ธกํ•œ๋‹ค.

 


์œ„์˜ ์ด๋ฏธ์ง€์—์„œ ํฐ์ƒ‰ ๋ถ€๋ถ„์€ ๋ญ˜ ์˜๋ฏธํ• ๊นŒ?

The white regions are where there was no majority among the k-nearest neighbors

 

Distance Metric

 

points ์‚ฌ์ด์˜ ๊ฑฐ๋ฆฌ๋ฅผ ๋น„๊ตํ•˜๊ธฐ ์œ„ํ•œ ๋ฐฉ๋ฒ•์—๋Š” L1์™ธ์—๋„ L2 ๋ฐฉ๋ฒ•์ด ์žˆ๋‹ค. L2 distance๋Š” ๊ฐ point๊ฐ„์˜ ์ฐจ์˜ ์ œ๊ณฑ์˜ ํ•ฉ์— ์ œ๊ณฑ๊ทผ์„ ์”Œ์šดํ˜•ํƒœ์ด๋‹ค. 

 

Depending on which distance metric we use, the space takes on a different geometry or topology. Under the L1 distance, what is really a "circle" forms a square shape: every point on the square is equidistant from the origin in L1 terms. The L1 distance depends on the choice of coordinate system; if we rotate the coordinates, L1 distances change. With the L2 distance, on the other hand, distances stay the same even when we change the coordinates.

 

If the entries of the input feature vector carry individual meaning, it is better to use the L1 distance; if the vector is just a generic point in space and we do not know much about each element, the L2 distance is a better choice.

 

For example, in a task classifying employees where our feature vector entries have individual meanings, such as salary or years of service, L1 may be the better fit!

 

 

The L1 distance tends to follow the coordinate axes, because the L1 distance depends on the coordinate system we choose.
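The coordinate-dependence claim is easy to verify numerically: rotating a difference vector changes its L1 length but not its L2 length. A small sketch (the 45-degree angle is an arbitrary choice for illustration):

```python
import numpy as np

d = np.array([3.0, 4.0])  # a difference vector between two points

def rotate(v, theta):
    """Rotate a 2-D vector by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([c * v[0] - s * v[1], s * v[0] + c * v[1]])

d_rot = rotate(d, np.pi / 4)  # same vector, axes rotated 45 degrees

l1_before, l1_after = np.sum(np.abs(d)), np.sum(np.abs(d_rot))
l2_before, l2_after = np.sqrt(np.sum(d ** 2)), np.sqrt(np.sum(d_rot ** 2))

print(l1_before, l1_after)  # 7.0 vs ~5.66: L1 depends on the axes
print(l2_before, l2_after)  # 5.0 vs 5.0: L2 is unchanged by rotation
```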

 

Setting Hyperparameters

K-nearest neighbors์—์„œ ๋‹ค๋ฅธ k๋ฅผ ๊ณ ๋ฅด๊ฑฐ๋‚˜ points ์‚ฌ์ด์—์„œ ๋‹ค๋ฅธ distance metrics๋ฅผ ๊ณ ๋ฅด๋Š” ๊ฒƒ์ฒ˜๋Ÿผ ์šฐ๋ฆฌ๋Š” ์„ ํƒ์„ ํ•˜๊ฒŒ ๋˜๋Š”๋ฐ, ์ด๋Ÿฌํ•œ ์„ ํƒ์ง€๋“ค์„ hyperparameters ๋ผ๊ณ  ๋ถ€๋ฅธ๋‹ค.

 

Hyperparameters are not something the model learns from the data; they are choices we make ourselves.

 

So which hyperparameters are the good ones?

=> It depends on the problem and the data! We have to try them out ourselves and find the hyperparameters that give the best performance.

 

 

ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ๊ฐ€ ๊ฐ€์žฅ ์„ฑ๋Šฅ์ด ์ข‹๋‹ค๊ณ  ํ• ๋•Œ ์ด ์„ฑ๋Šฅ์€ ์–ด๋””์—์„œ ์ฐฉ์•ˆ๋œ ๊ฒƒ์ผ๊นŒ?

 

์šฐ๋ฆฌ์˜ ๋ฐ์ดํ„ฐ์…‹์„ train, validation, test์™€ ๊ฐ™์ด ์„ธ ๊ทธ๋ฃน์œผ๋กœ ๋‚˜๋ˆˆ๋‹ค.

training data๋ฅผ ์—ฌ๋Ÿฌ ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ๋กœ ํ•™์Šต ์‹œ์ผœ๋ณด๊ณ  validation set์œผ๋กœ ํ‰๊ฐ€ํ•œ๋‹ค.

validation set์—์„œ ๊ฐ€์žฅ ์„ฑ๋Šฅ์ด ์šฐ์ˆ˜ํ–ˆ๋˜ ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ์„ ํƒํ•œ๋‹ค.

๋ชจ๋“  ์ž‘์—…์ด ๋๋‚œ ๋’ค validation set์—์„œ ๊ฐ€์žฅ ์„ฑ๋Šฅ์ด ์ข‹์€ classifier๋ฅผ ๊ฐ€์ง€๊ณ  test set์—์„œ ์„ฑ๋Šฅ ํ‰๊ฐ€๋ฅผ ํ•œ๋‹ค.

test dataset ์—์„œ ํ‰๊ฐ€ํ•œ ๊ฒฐ๊ณผ๊ฐ€ ๋ณดํ†ต ๋…ผ๋ฌธ์ด๋‚˜ report์— ์ ๊ฒŒ ๋˜๋Š” ์šฐ๋ฆฌ ๋ชจ๋ธ์˜ ์ตœ์ข… score๋‹ค.
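The procedure above can be sketched end to end. Everything here is illustrative: the two-blob toy data, the candidate k values, and the helper name `knn_accuracy` are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up toy data: two well-separated Gaussian blobs, labels 0 and 1.
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Shuffle once, then carve out train / validation / test splits.
idx = rng.permutation(100)
X, y = X[idx], y[idx]
Xtr, ytr = X[:60], y[:60]
Xval, yval = X[60:80], y[60:80]
Xte, yte = X[80:], y[80:]

def knn_accuracy(k, Xtr, ytr, Xq, yq):
    """Accuracy of an L1 k-NN classifier on the queries (Xq, yq)."""
    correct = 0
    for x, label in zip(Xq, yq):
        dists = np.sum(np.abs(Xtr - x), axis=1)
        votes = np.bincount(ytr[np.argsort(dists)[:k]])
        correct += int(np.argmax(votes) == label)
    return correct / len(yq)

# Choose the k that does best on the validation set...
best_k = max([1, 3, 5, 7], key=lambda k: knn_accuracy(k, Xtr, ytr, Xval, yval))
# ...then report the final number once, on the untouched test set.
print(best_k, knn_accuracy(best_k, Xtr, ytr, Xte, yte))
```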

 

Cross Validation

 

์ข‹์€ hyperparameters๋ฅผ ์„ค์ •ํ•˜๊ธฐ ์œ„ํ•œ ๋˜ ๋‹ค๋ฅธ ๋ฐฉ๋ฒ•์€ Cross validation์ด๋‹ค. ๋‹ค๋งŒ ์ด ๋ฐฉ๋ฒ•์€ deep learning์—์„œ๋Š” ๊ฑฐ์˜ ์‚ฌ์šฉ๋˜์ง€ ์•Š๊ณ  ์ž‘์€ dataset์„ ๊ฐ€์ง€๋Š” ๊ฒฝ์šฐ์— ์‚ฌ์šฉ๋œ๋‹ค. 

 

Cross Validation์—์„œ๋Š” test dataset์„ ์ œ์™ธํ•œ ๋‚˜๋จธ์ง€ ๋ฐ์ดํ„ฐ์…‹์„ ๋‹จ์ˆœํžˆ ํ•˜๋‚˜์˜ train, validation dataset์œผ๋กœ ๋‚˜๋ˆ„๋Š” ๊ฒƒ์ด ์•„๋‹ˆ๋ผ ์šฐ๋ฆฌ์˜ ๋ฐ์ดํ„ฐ์…‹์„ ์—ฌ๋Ÿฌ๊ฐœ์˜ folds๋กœ ๋‚˜๋ˆ„๋Š” ๊ฒƒ์ด๋‹ค. ๊ทธ๋ฆฌ๊ณ  validation dataset์œผ๋กœ ์‚ฌ์šฉํ•  dataset์„ rotationํ•˜๋Š” ๊ฒƒ์ด๋‹ค. 

 

 

Looking at the graph, each x value has 5 y values; these are the results of five-fold cross validation. Instead of a single result, we can see the distribution of results across the different validation sets, which lets us choose better hyperparameters.
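A minimal sketch of the fold rotation described above, again on made-up blob data (the helper name `cross_validate` and the candidate k values are assumptions for illustration, not CS231n's assignment code):

```python
import numpy as np

def cross_validate(k_choices, X, y, num_folds=5):
    """For each candidate k, average L1 k-NN accuracy across num_folds folds,
    rotating which fold plays the role of the validation set."""
    X_folds = np.array_split(X, num_folds)
    y_folds = np.array_split(y, num_folds)
    results = {}
    for k in k_choices:
        accs = []
        for i in range(num_folds):
            # Fold i is the validation set; the rest form the training set.
            Xval, yval = X_folds[i], y_folds[i]
            Xtr = np.concatenate(X_folds[:i] + X_folds[i + 1:])
            ytr = np.concatenate(y_folds[:i] + y_folds[i + 1:])
            correct = 0
            for x, label in zip(Xval, yval):
                dists = np.sum(np.abs(Xtr - x), axis=1)
                votes = np.bincount(ytr[np.argsort(dists)[:k]])
                correct += int(np.argmax(votes) == label)
            accs.append(correct / len(yval))
        results[k] = float(np.mean(accs))  # one distribution summarized per k
    return results

# Made-up, well-separated blob data just to exercise the function.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (40, 2)), rng.normal(4, 1, (40, 2))])
y = np.array([0] * 40 + [1] * 40)
perm = rng.permutation(80)
results = cross_validate([1, 3, 5], X[perm], y[perm])
print(results)
```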

 

 

k-Nearest Neighbor on images never used.

K-Nearest Neighbor์€ ์ด๋ฏธ์ง€์—์„œ๋Š” ์“ฐ์ด์ง€ ์•Š๋Š”๋‹ค! ์œ„์˜ ์‚ฌ์ง„์€ ๋ชจ๋‘ ๋‹ค๋ฅธ ์ด๋ฏธ์ง€์ธ๋ฐ ๋ณ€ํ˜•๋œ ์„ธ๊ฐœ์˜ ์ด๋ฏธ์ง€ ๋ชจ๋‘ ์›๋ณธ ์ด๋ฏธ์ง€์™€ ๋น„๊ตํ–ˆ์„ ๋•Œ ๋™์ผํ•œ L2 distance๋ฅผ ๊ฐ€์กŒ๋‹ค.

 

์—ฌ๊ธฐ์„œ ๊ถ๊ธˆํ–ˆ๋˜ ์ ์ด ์ƒ๊ฒผ๋‹ค.

1. ์ด๋ฏธ์ง€๊ฐ€ ์กฐ๊ธˆ ๋ณ€ํ˜•๋˜๊ธด ํ–ˆ์ง€๋งŒ ์—ฌ์ „ํžˆ ๊ฐ™์€ ์ธ๋ฌผ์ด๊ธฐ์— ๋™์ผํ•œ L2 Distance๋ฅผ ๊ฐ€์ง€๋ฉด ์ข‹์€๊ฒŒ ์•„๋‹Œ๊ฐ€?

2. ์ด๋ฏธ์ง€๊ฐ€ ์–ด์ฐŒ๋๋“  ๋ฐ”๋€Œ์—ˆ๋Š”๋ฐ ์–ด๋–ป๊ฒŒ ๊ฐ™์€ L2 Distance๋ฅผ ๊ฐ€์ง€์ง€?...

 

Fortunately, students asked the same questions during the lecture, so my curiosity was resolved.

 

The answer to question 1 is that it may hold in this particular example, but counterexamples exist. For instance, given two different original images, if we draw boxes or add color in just the right places, we can make the distance between the two images small. Conversely, as in the example above, arbitrarily shifting or tinting one and the same image changes the distance at will. So when many different images end up with the same distance, that can be a bad sign for the metric.

 

The answer to question 2 is that the TA deliberately constructed the three images so that they would all have the same L2 distance to the original..!

 

 

 

 

Linear Classification

 

Neural Networks are said to be like Lego blocks: we can stack components of various kinds to build a large network. The most basic component (block) here is the linear classifier, and linear networks are the most basic parametric model.

 

 

Parametric Approach

 

 

์šฐ๋ฆฌ์˜ Parametric model์€ ํฌ๊ฒŒ ๋‘ ๊ฐœ์˜ components๋ฅผ ๊ฐ€์ง„๋‹ค.

1. X: input data

2. W: set of parameters or weights 

 

K-nearest neighbors ์—๋Š” parameter๊ฐ€ ์—†์ด train data๋ฅผ test๋•Œ ํ†ต์œผ๋กœ ์‚ฌ์šฉํ–ˆ๋‹ค.

ํ•˜์ง€๋งŒ parametric approch์—์„œ๋Š” train data์˜ ์ •๋ณด๋ฅผ ์š”์•ฝํ•˜๊ณ  ์š”์•ฝํ•œ ์ •๋ณด๋ฅผ ํŒŒ๋ผ๋ฏธํ„ฐ W์— ๋ชจ์•„์ค€๋‹ค. ์ด๋ ‡๊ฒŒ ๋˜๋ฉด test์—์„œ ๋” ์ด์ƒ train data๊ฐ€ ํ•„์š” ์—†์–ด์ง„๋‹ค. -> model์ด efficientํ•˜๋ฉฐ light ํ•ด์ง„๋‹ค.

 

To summarize, deep learning is largely the business of designing the structure of an appropriate function F.

 

So in what way should we combine W and X?

The simplest way is to multiply the weights W with the data X, and this method is linear classification.

 

f(x, W) = Wx + b

Here b is the bias. The bias gives certain classes a head start independent of the data: if there are more cats than dogs in the dataset, the bias corresponding to the cat class grows larger.

 

Example with an image with 4 pixels, and 3 classes (cat/dog/ship)

 

Linear classification์€ ํƒฌํ”Œ๋ฆฟ ๋งค์นญ์ด๋ผ๊ณ  ์ƒ๊ฐํ•  ์ˆ˜ ์žˆ๋‹ค.

 

์œ„์—์„œ W์˜ ๊ฐ ํ–‰์€ ๊ฐ ์ด๋ฏธ์ง€์— ๋Œ€ํ•œ ํ…œํ”Œ๋ฆฟ์œผ๋กœ ๋ณผ ์ˆ˜ ์žˆ๋‹ค.

์ด ํ–‰ ๋ฒกํ„ฐ(1 x 4)์™€ ์ด๋ฏธ์ง€์˜ (4 x 1) ์—ด๋ฒกํ„ฐ ๊ฐ„์˜ ๋‚ด์ ์„ ๊ณ„์‚ฐํ•˜๊ฒŒ ๋˜๋Š”๋ฐ ์—ฌ๊ธฐ์„œ ๋‚ด์ ์€ ๊ฒฐ๊ตญ ํด๋ž˜์Šค ๊ฐ„ ํƒฌํ”Œ๋ฆฟ์˜ ์œ ์‚ฌ๋„๋ฅผ ์ธก์ •ํ•˜๋Š” ํ–‰์œ„๋ผ๊ณ  ๋ณผ ์ˆ˜ ์žˆ๋‹ค.

 

bias๋Š” ๋ฐ์ดํ„ฐ ๋…๋ฆฝ์ ์œผ๋กœ ๊ฐ ํด๋ž˜์Šค์— scaling offsets์„ ๋”ํ•ด์ฃผ๋Š” ๊ฒƒ์ด๋‹ค.

 

Interpreting a Linear Classifier

 

๊ฐ€์ค‘์ง€ W์˜ ๊ฐ ํ–‰์„ ์‹œ๊ฐํ™” ํ•ด๋ณด๋ฉด linear classifier๊ฐ€ ์ด๋ฏธ์ง€๋ฅผ ์ธ์‹ํ•˜๊ธฐ ์œ„ํ•ด ์–ด๋–ค ํ…œํ”Œ๋ฆฟ์„ ์‚ฌ์šฉํ•˜๋Š”์ง€ ์ง์ž‘ํ•ด ๋ณผ ์ˆ˜ ์žˆ๋‹ค. ์—ฌ๊ธฐ์„œ ๋ฌธ์ œ์ ์€ Linear classifier๋Š” ๊ฐ ํด๋ž˜์Šค์— ๋Œ€ํ•ด ๋‹จ ํ•˜๋‚˜์˜ ํ…œํ”Œ๋ฆฟ๋งŒ์„ ํ•™์Šตํ•œ๋‹ค๋Š” ๊ฒƒ์ด๋‹ค. ํ•œ ํด๋ž˜์Šค๋ผ๋„ ๋‹ค์–‘ํ•œ ํŠน์ง•๋“ค์ด ์กด์žฌํ•  ์ˆ˜ ์žˆ๋Š”๋ฐ ์ด๋ฅผ ํ‰๊ท ํ™” ์‹œํ‚ค๊ธฐ์— ๋‹ค์–‘ํ•œ ํŠน์ง•์„ ํ…œํ”Œ๋ฆฟ ํ•˜๋‚˜๋กœ๋งŒ ์š”์•ฝํ•ด์•ผ ํ•œ๋‹ค.

 

Hard cases for a linear classifier

 

The first example is the parity problem of classifying odd versus even, which is a hard problem to solve with linear classification.

 

In the third example there are three blue points, and every region outside those points is red. This kind of case is called a multimodal problem: one class can be distributed across several separate regions of the space, and this too is a hard problem for linear classification.

