Python Random sample() Method

Usage

The random.sample() method returns a specified number of items chosen at random from a sequence such as a list, tuple, or string, without picking the same position in the sequence twice.

Note that this method is designed for sampling without replacement, meaning it won’t select the same item multiple times.

Syntax

random.sample(population, k, *, counts=None)

Parameters

Parameter | Condition | Description
population | Required | The sequence you want to get the sample from. It can be a list, tuple, string, range, etc. Sets are not accepted as of Python 3.11; convert them to a list or tuple first.
k | Required | An integer specifying the size of the returned list.
counts | Optional | A list giving, for each item in the population, the number of times that item can be selected (added in Python 3.9). The default value of None means each item in the population can be chosen only once.
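Because a set is not a sequence, Python 3.11 and later reject it as the population. A minimal sketch of the usual workaround is shown below; the tags set is a made-up example, and sorted() is just one way to get a list with a stable order.

```python
import random

# Hypothetical data: a set has no defined order and is not a sequence
tags = {'alpha', 'beta', 'gamma', 'delta'}

# sorted() converts the set into a list with a stable order
tag_list = sorted(tags)

# Sample from the list instead of the set
chosen = random.sample(tag_list, 2)
print(chosen)
```

Sorting first also keeps seeded runs reproducible, since the iteration order of a set of strings can change between interpreter sessions.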

Note that the * is not a parameter. Rather, it indicates the end of the positional arguments. Every argument following the * must be specified as a keyword argument.
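As a quick illustration of the keyword-only rule, the sketch below passes counts both ways. The items list and the positional_error variable are placeholders for this example, and counts assumes Python 3.9 or later.

```python
import random

items = ['a', 'b', 'c']

# Keyword form: accepted
picked = random.sample(items, 2, counts=[1, 1, 1])
print(picked)

# Positional form: rejected, because counts is keyword-only
try:
    random.sample(items, 2, [1, 1, 1])
    positional_error = None
except TypeError as exc:
    positional_error = exc
    print("TypeError:", exc)
```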

Return Value

The method returns a new list containing k elements chosen from distinct positions in the population. The values are unique only if the population itself contains no duplicates. The original sequence remains unchanged.
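A small check of this behavior might look like the following. The population and snapshot names are only for illustration.

```python
import random

population = [10, 20, 30, 40, 50]
snapshot = list(population)  # copy used to confirm nothing changes

result = random.sample(population, 3)

print(result)
print(result is population)    # False: a new list is returned
print(population == snapshot)  # True: the original is untouched
```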

Module Import

To use random.sample(), you’ll first need to import the random module.

import random

Examples

Here’s how you can use the random.sample() method. The following example returns a list of five unique numbers randomly chosen from my_list.

import random

# Define a list of numbers
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Select 5 unique random numbers from the list
print(random.sample(my_list, 5))
# Possible output: [8, 6, 2, 10, 7]

Similarly, you can use random.sample() with strings. In this case, it randomly selects characters from distinct positions in your string, so a letter that occurs twice in the string can appear twice in the result.

import random

# Define a string
my_string = "Hello"

# Select 3 random characters from distinct positions in the string
print(random.sample(my_string, 3))
# Possible output: ['l', 'o', 'l']

Sampling without replacement does not mean that the items in the population must be unique. It means that each position in the population can be picked at most once. If the population contains repeated items, each occurrence is a separate possible selection, so the same value can appear more than once in the sample.

import random

colors = ['red', 'red', 'red', 'green', 'green', 'blue']
print(random.sample(colors, 4))
# Possible output: ['red', 'red', 'green', 'red']

Using the k Parameter

The k parameter in the random.sample() function controls the size of the random sample you want to extract.

It’s important to remember that you cannot request a sample larger than the original population itself, and k cannot be negative. Attempting either will result in a ValueError.

my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(random.sample(my_list, 12))
# ValueError: Sample larger than population or is negative
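If the requested size comes from user input or another computation, one way to avoid the error is to clamp k to the valid range before sampling. The sketch below uses a hypothetical helper named sample_up_to, which is not part of the random module.

```python
import random

def sample_up_to(population, k):
    """Hypothetical helper: return at most k items without raising for an out-of-range k."""
    # Clamp k into the valid range 0..len(population)
    k = max(0, min(k, len(population)))
    return random.sample(population, k)

my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

capped = sample_up_to(my_list, 12)
print(len(capped))  # 10
```

Whether silently clamping is appropriate depends on your use case. When an oversized k signals a bug, letting the ValueError surface is usually the better choice.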

Using the counts Parameter

Sometimes, you might want to create a random sample where specific elements from your original sequence can be chosen multiple times.

While you could achieve this by listing the repeated elements directly within the population, the random.sample() function offers a more concise way using the optional counts parameter. This parameter lets you specify how many times each unique element is eligible to appear in your sample.

For example, suppose you have a list of colors:

colors = ['red', 'green', 'blue']

You want to create a random sample where ‘red’ can appear up to three times, ‘green’ up to twice, and ‘blue’ only once. Instead of explicitly writing out the repeated colors like this:

colors = ['red', 'red', 'red', 'green', 'green', 'blue']

You can use the counts parameter to streamline the process:

colors = ['red', 'green', 'blue']  # Unique colors
counts = [3, 2, 1]  # Corresponding repetition counts

random_sample = random.sample(colors, 4, counts=counts)
print(random_sample)
# Possible output: ['red', 'red', 'blue', 'red']

So the following two calls sample from the same pool of six items and are equivalent:

random.sample(['red', 'green', 'blue'], 4, counts=[3, 2, 1])
random.sample(['red', 'red', 'red', 'green', 'green', 'blue'], 4)

While both calls represent the same underlying idea, the counts parameter simply offers a cleaner way to express this.
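To see that counts acts as an upper limit per item, you can draw many samples and tally them with collections.Counter. This is only an illustrative check, assuming Python 3.9 or later; the limits and within_limits names are made up for the example.

```python
import random
from collections import Counter

colors = ['red', 'green', 'blue']
limits = [3, 2, 1]

within_limits = True
for _ in range(1000):
    # Tally how often each color appears in one sample of 4
    drawn = Counter(random.sample(colors, 4, counts=limits))
    for color, limit in zip(colors, limits):
        if drawn[color] > limit:
            within_limits = False

print(within_limits)  # True
```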

Points to Note About the counts Parameter

  • The total number of potential selections you define in the counts list must be at least as large as your desired sample size k. Otherwise, you’ll get a ValueError because it’s impossible to create the sample you’ve requested.
  • Once an element’s count reaches zero (it’s been selected the maximum allowed number of times), it won’t be eligible for selection again – even if you haven’t reached your full sample size k yet.
  • If you don’t use the counts parameter, or set it to None, the random.sample() function assumes that each element in the population can only be chosen once. This is the standard “sampling without replacement” behavior, which gives you a list of unique items as long as the population itself has no duplicates.
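The first point can be demonstrated with a short sketch, again assuming Python 3.9 or later. The everything and too_many_error names are only for this example.

```python
import random

colors = ['red', 'green', 'blue']
counts = [3, 2, 1]  # 6 selections available in total

# k equal to the total is allowed and uses up every count
everything = random.sample(colors, sum(counts), counts=counts)
print(sorted(everything))
# ['blue', 'green', 'green', 'red', 'red', 'red']

# k larger than the total cannot be satisfied
try:
    random.sample(colors, 7, counts=counts)
    too_many_error = None
except ValueError as exc:
    too_many_error = exc
    print("ValueError:", exc)
```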

Controlling the Randomness using random.seed()

The random.sample() method draws its values from Python’s pseudo-random number generator, which is deterministic. This might sound contradictory, but the sequence of “random” numbers is fully determined by an initial value known as a “seed.” When you use the same seed value, the generator produces the same sequence of numbers, making the output predictable and repeatable.

Let’s see how you can set the seed value using random.seed() and control the randomness:

import random

# Set seed for repeatability
random.seed(5)

# Return a list of 5 unique random numbers chosen from my_list
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(random.sample(my_list, 5))
# Possible output: [10, 5, 6, 7, 8]

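Resetting the seed to the same value and sampling again will produce the same output. Without the reset, the generator continues from where it left off, and the second sample will usually differ.

# Reset the seed to the same value, then sample again
random.seed(5)
print(random.sample(my_list, 5))
# Output: [10, 5, 6, 7, 8]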

Using consistent seeds is very helpful during debugging, as it lets you isolate issues by reproducing the exact same random behavior. It’s also valuable when you need your results to be replicable, like in certain simulations or experiments.
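If you’d rather not change the state of the shared module-level generator, a common alternative is to create your own random.Random instance with a seed. The sketch below assumes two independent generators, rng_a and rng_b (names chosen for illustration), seeded with the same value.

```python
import random

my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Two independent generators seeded with the same value
rng_a = random.Random(5)
rng_b = random.Random(5)

first = rng_a.sample(my_list, 5)
second = rng_b.sample(my_list, 5)

print(first == second)  # True: same seed, same sequence
```

Each instance keeps its own state, so seeding one does not affect calls made through the random module elsewhere in your program.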