The NumPy library has many sub-modules with extra functions and features to assist you. One of these is the “random” module, which we can use to generate random data.
You may have previously heard about other libraries for generating random data. The NumPy random module, however, comes with some extra functions that you won’t find elsewhere. We will be discussing some of these in today’s tutorial.
Basic Random Data Generation with NumPy
There are various functions for generating random data in NumPy, each with its own twist. Let’s discuss them one by one. (All of these functions are found in the numpy.random module.)
Generating Random Integers
The first and probably the most commonly used one is randint(). There are many different ways we can use this function, depending on the number of parameters used.
The simplest case involves passing a single parameter.
import numpy as np
print(np.random.randint(100))
This will print out a random integer from 0 to 99. The upper bound, 100, is not included.
In the above example, 0 is assumed as the starting point for the range from which the random number is picked. By passing in two parameters, we can change the starting point.
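The snippet for this case seems to be missing from the text, so here is a minimal sketch of what it would likely look like. It assumes the bounds 50 and 100 mentioned in the next sentence, and the variable name `value` is only there for illustration.

```python
import numpy as np

# Two arguments: the first is the lower bound (inclusive),
# the second is the upper bound (exclusive).
value = np.random.randint(50, 100)
print(value)
```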
This code will print out a number from 50 to 99, since the upper bound of 100 is not included.
By passing in a third parameter, we can have this function return more than one number in the form of an array. The third parameter defines how many values will be returned. If it is left out, a single integer is returned instead of an array.
print(np.random.randint(0, 10, 3))
[2 3 9]
As you can see, the above code returned 3 values.
The third parameter is actually called the size parameter. By passing in a tuple containing two integers instead of a single integer, you can have it return a 2D array of random numbers.
print(np.random.randint(0, 10, size = (3, 4)))
[[2 8 3 5]
 [6 2 8 9]
 [2 7 5 7]]
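If you want to convince yourself that the size tuple really controls the shape, a quick check along these lines may help. This is just a sketch; the variable name `grid` is made up for the example.

```python
import numpy as np

# Request 3 rows and 4 columns of integers from 0 to 9
grid = np.random.randint(0, 10, size=(3, 4))

print(grid.shape)  # (3, 4)

# Every value should be at least 0 and strictly below 10
in_range = bool(grid.min() >= 0 and grid.max() < 10)
print(in_range)  # True
```

The first number in the tuple is the number of rows and the second is the number of columns.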
Generating Random Floats
To generate random floats, we have a different function called rand(). This function can be called without passing any parameters, and will generate a random float from 0 up to (but not including) 1.
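The example for this call appears to have been lost from the text. A minimal sketch, assuming the usual `np` alias, might look like this (the name `x` is just for illustration):

```python
import numpy as np

# No arguments: returns a single float in the range [0, 1)
x = np.random.rand()
print(x)
```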
By passing an integer n, you can have an array of size n returned with random float values.
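The code for this step is also missing. Judging by the five values in the sample output, it was probably something close to the following sketch, with n assumed to be 5:

```python
import numpy as np

# One argument: returns a 1D array with that many random floats
values = np.random.rand(5)
print(values)
```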
[0.50132969 0.94523932 0.12451228 0.77109302 0.53581386]
You can also generate 2D random float arrays by passing in a second parameter.
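Again the original snippet is not shown. Since the sample output has 5 rows and 4 columns, a reasonable guess at the call is sketched here (the name `table` is hypothetical):

```python
import numpy as np

# Two arguments: number of rows, then number of columns
table = np.random.rand(5, 4)
print(table)
```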
[[0.60346791 0.40194638 0.20951285 0.33231324]
 [0.85746236 0.83145176 0.22490456 0.38462784]
 [0.61896829 0.45537222 0.47277243 0.54272509]
 [0.06944676 0.69805889 0.6685201  0.70558914]
 [0.50977764 0.72910723 0.42678105 0.98245785]]
Making Random Choices
We can use the choice() function to make random choices from a one-dimensional sequence of values, such as a list. For example, if we give it a list of values, it will randomly pick one (or more) of them.
print(np.random.choice(["a", "b", "c", "d", "e"]))
You can add a second parameter to choose how many picks you want to make. Note that the same value can be picked more than once.
print(np.random.choice(["a", "b", "c", "d", "e"], 5))
['c' 'e' 'b' 'd' 'e']
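Notice that 'e' shows up twice in the output above. If you would rather have each value picked at most once, choice() accepts a `replace` argument. This goes slightly beyond what the tutorial covers, so treat it as an optional sketch; the names `letters` and `picks` are just for illustration.

```python
import numpy as np

letters = ["a", "b", "c", "d", "e"]

# replace=False means a value cannot be picked a second time,
# so the number of picks must not exceed the length of the list.
picks = np.random.choice(letters, 3, replace=False)
print(picks)
```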
You can also use choice() to generate random integers by giving it a list of numbers, or even a single integer n, in which case it picks from the values 0 to n - 1. (This acts similarly to randint() with one parameter.)
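The example for this seems to be missing, so here is a small sketch of both forms. The specific numbers are assumptions chosen for illustration.

```python
import numpy as np

# Picking from an explicit list of numbers
from_list = np.random.choice([10, 20, 30, 40])
print(from_list)

# Passing a single integer picks from 0, 1, ..., 9
from_int = np.random.choice(10)
print(from_int)
```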
You can also pass in a second parameter to return more values.
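As a rough sketch of this (the values 10 and 3 are assumptions, not taken from the original):

```python
import numpy as np

# Pick 3 values from the range 0 to 9; repeats are possible
several = np.random.choice(10, 3)
print(several)
```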
Using a Seed to generate Random Numbers
NumPy generates “random” data starting from what we call a “seed”. A “seed” is a base value that is used to initialize a random number generator. If you don’t set one yourself, NumPy picks a seed for you, typically from unpredictable data supplied by the operating system, falling back to the system clock if that is not available. Because this value is different on every run, the same sequence of numbers is not repeated.
In certain applications, however, you may wish to manually define a seed. One example of where this is useful is when multiple people are working on the same project. If they use a common seed, they will all get the same sequence of numbers, which makes debugging and collaboration easier.
You can observe the behavior of a seed here.
import numpy as np
np.random.seed(0)
print(np.random.rand(5))
np.random.seed(0)
print(np.random.rand(5))
[0.5488135  0.71518937 0.60276338 0.54488318 0.4236548 ]
[0.5488135  0.71518937 0.60276338 0.54488318 0.4236548 ]
The two outputs are exactly the same, because we reset the seed after the first output. This shows that, for a given seed, the generator follows a predictable sequence.
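The same idea applies to the other functions covered earlier, not just rand(). Here is a small sketch using randint(); the seed value 42 and the variable names are arbitrary choices for the example.

```python
import numpy as np

# Seed, then draw three integers
np.random.seed(42)
first = np.random.randint(0, 10, 3)

# Reset to the same seed and draw again
np.random.seed(42)
second = np.random.randint(0, 10, 3)

# Both draws contain exactly the same values
print(np.array_equal(first, second))  # True
```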
This marks the end of the “How to generate random data with NumPy” tutorial. Any suggestions or contributions for CodersLegacy are more than welcome. Questions regarding the tutorial content can be asked in the comments section below.