English 中文(简体)
What is the inverse of regularization strength in Logistic Regression? How should it affect my code?
原标题:

I am using sklearn.linear_model.LogisticRegression in scikit learn to run a Logistic Regression.

C : float, optional (default=1.0) Inverse of regularization strength;
    must be a positive float. Like in support vector machines, smaller
    values specify stronger regularization.

What does C mean here in simple terms? What is regularization strength?

最佳回答

Regularization is applying a penalty to increasing the magnitude of parameter values in order to reduce overfitting. When you train a model such as a logistic regression model, you are choosing parameters that give you the best fit to the data. This means minimizing the error between what the model predicts for your dependent variable given your data compared to what your dependent variable actually is.

The problem comes when you have a lot of parameters (a lot of independent variables) but not too much data. In this case, the model will often tailor the parameter values to idiosyncrasies in your data -- which means it fits your data almost perfectly. However because those idiosyncrasies don t appear in future data you see, your model predicts poorly.

To solve this, as well as minimizing the error as already discussed, you add to what is minimized and also minimize a function that penalizes large values of the parameters. Most often the function is λΣθj2, which is some constant λ times the sum of the squared parameter values θj2. The larger λ is the less likely it is that the parameters will be increased in magnitude simply to adjust for small perturbations in the data. In your case however, rather than specifying λ, you specify C=1/λ.

问题回答

In one sentence, regularization makes the model perform worse on training data so that it may perform better on holdout data.

Logistic regression is an optimization problem where the following objective function is minimized.

func1

where loss function looks like (at least for solver= lbfgs ) the following.

loss func

Regularization adds a norm of the coefficients to this function. The following implements the L2 penalty.

func2

From the equation, it s clear that the regularization term is there to penalize large coefficients (the minimization problem is solving for the coefficients that minimize the objective function). Since the size of each coefficient depends on the scale of its corresponding variable, scaling the data is required so that the regularization penalizes each variable equally. The regularization strength is determined by C and as C increases, the regularization term becomes smaller (and for extremely large C values, it s as if there is no regularization at all).

If the initial model is overfit (as in, it fits the training data too well), then adding a strong regularization term (with small C value) makes the model perform worse for the training data, but introducing such "noise" improves the model s performance on unseen (or test) data.


An example with 1000 samples and 200 features shown below. As can be seen from the plot of accuracy over different values of C, if C is large (with very little regularization), there is a big gap between how the model performs on training data and test data. However, as C decreases, the model performs worse on training data but performs better on test data (test accuracy increases). However, when C becomes too small (or the regularization becomes too strong), the model begins performing worse again because now the regularization term completely dominates the objective function.

C vs accuracy


Code used to to produce the graph:

import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# make sample data
X, y = make_classification(1000, 200, n_informative=195, random_state=2023)
# split into train-test datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2023)

# normalize the data
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# train Logistic Regression models for different values of C
# and collect train and test accuracies
scores = {}
for C in (10**k for k in range(-6, 6)):
    lr = LogisticRegression(C=C)
    lr.fit(X_train, y_train)
    scores[C] = { train accuracy : lr.score(X_train, y_train), 
                  test accuracy : lr.score(X_test, y_test)}

# plot the accuracy scores for different values of C
pd.DataFrame.from_dict(scores,  index ).plot(logx=True, xlabel= C , ylabel= accuracy );




相关问题
Can Django models use MySQL functions?

Is there a way to force Django models to pass a field to a MySQL function every time the model data is read or loaded? To clarify what I mean in SQL, I want the Django model to produce something like ...

An enterprise scheduler for python (like quartz)

I am looking for an enterprise tasks scheduler for python, like quartz is for Java. Requirements: Persistent: if the process restarts or the machine restarts, then all the jobs must stay there and ...

How to remove unique, then duplicate dictionaries in a list?

Given the following list that contains some duplicate and some unique dictionaries, what is the best method to remove unique dictionaries first, then reduce the duplicate dictionaries to single ...

What is suggested seed value to use with random.seed()?

Simple enough question: I m using python random module to generate random integers. I want to know what is the suggested value to use with the random.seed() function? Currently I am letting this ...

How can I make the PyDev editor selectively ignore errors?

I m using PyDev under Eclipse to write some Jython code. I ve got numerous instances where I need to do something like this: import com.work.project.component.client.Interface.ISubInterface as ...

How do I profile `paster serve` s startup time?

Python s paster serve app.ini is taking longer than I would like to be ready for the first request. I know how to profile requests with middleware, but how do I profile the initialization time? I ...

Pragmatically adding give-aways/freebies to an online store

Our business currently has an online store and recently we ve been offering free specials to our customers. Right now, we simply display the special and give the buyer a notice stating we will add the ...

Converting Dictionary to List? [duplicate]

I m trying to convert a Python dictionary into a Python list, in order to perform some calculations. #My dictionary dict = {} dict[ Capital ]="London" dict[ Food ]="Fish&Chips" dict[ 2012 ]="...

热门标签