Pandas: How to Create a DataFrame with Random Values (N x M)

Python and Pandas are very useful when you need to generate test, random, or fake data. For example, suppose you need three DataFrames:

5 columns with 100 rows of random integers
5 columns with 100 rows of random characters
3 columns and 10 rows with random decimals

Generate a DataFrame with random numbers: 5 columns, 100 rows

What I need most often is a DataFrame of random integers. NumPy's randint function does this. Note that the upper bound is exclusive, so the call below returns values from 0 to 99:

np.random.randint(0,100,size=(100, 5))

This will be the code:

import pandas as pd
import numpy as np
df2 = pd.DataFrame(np.random.randint(0,100,size=(100, 5)), columns=list('ABCDF'))
df2.head()

the result of which is:

	A	B	C	D	F
0	19	71	99	21	5
1	85	89	38	40	83
2	95	29	1	11	22
3	39	26	43	43	93
4	6	1	33	14	54
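If the test data has to be the same on every run, a seeded generator helps. The following is a minimal sketch using NumPy's newer Generator API instead of np.random.randint; the seed value 42 and the name df_int are arbitrary choices made for this illustration.

```python
import numpy as np
import pandas as pd

# Seeded generator: the same table comes back on every run.
# The seed (42) is an arbitrary value picked for this sketch.
rng = np.random.default_rng(42)

# Like randint, integers() excludes the upper bound, so values are 0..99.
values = rng.integers(0, 100, size=(100, 5))

df_int = pd.DataFrame(values, columns=list("ABCDE"))
print(df_int.shape)  # (100, 5)
```

Changing the size tuple to (N, M) gives a DataFrame with N rows and M columns.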

Generate a DataFrame with random characters: 5 columns, 100 rows

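Another useful example is a DataFrame of random characters. Older pandas versions ship a helper that returns a random string of a given length (pd.util.testing was deprecated in pandas 1.0 and removed in 2.0, so this call only works on older installations):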

pd.util.testing.rands(3)

result of which is:

'E0z'

To split the randomly generated string into single characters we use the built-in list function.

The first part of the code is:

rand_chars = []
for i in range(0, 5):
    rand_chars.append(list(pd.util.testing.rands(100)))
rand_chars = list(map(list, zip(*rand_chars)))
rand_chars[0:5]

the result of which is:

[['4', '8', 'v', 'g', 'c'],
 ['d', '6', 'n', 'b', 'H'],
 ['D', 'g', 'I', 's', 'O'],
 ['0', 'h', 'm', 'z', 's'],
 ['T', 'n', 'c', 'U', 'S']]

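Each of the 5 generated strings holds the values of one column, so the list of 5 lists with 100 characters each has to be transposed into 100 rows of 5 characters. That is what this line does: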

rand_chars = list(map(list, zip(*rand_chars)))
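To make the transpose idiom easier to follow, here is a small sketch on a toy list. The names rows and transposed are made up for this illustration.

```python
# Two "columns" of three characters each.
rows = [["a", "b", "c"], ["d", "e", "f"]]

# zip(*rows) pairs up the first items, then the second items, and so on;
# map(list, ...) turns each resulting tuple back into a list.
transposed = list(map(list, zip(*rows)))

print(transposed)  # [['a', 'd'], ['b', 'e'], ['c', 'f']]
```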

Finally we create the DataFrame:

df2 = pd.DataFrame(rand_chars)
df2.head()

result:

	0	1	2	3	4
0	H	L	x	s	3
1	S	Y	l	p	n
2	q	d	F	9	6
3	O	k	w	C	L
4	D	E	U	C	n
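Because pd.util.testing is not available in recent pandas releases (it was removed in pandas 2.0), a similar table can be built with the standard library alone. This is a sketch under the assumption that letters and digits are the wanted alphabet; the names alphabet, n_rows, n_cols and df_chars are chosen for this illustration.

```python
import random
import string

import pandas as pd

# Assumed alphabet: upper and lower case ASCII letters plus digits.
alphabet = string.ascii_letters + string.digits
n_rows, n_cols = 100, 5

# Build the data row by row, so no transpose step is needed.
data = [random.choices(alphabet, k=n_cols) for _ in range(n_rows)]

df_chars = pd.DataFrame(data)
print(df_chars.shape)  # (100, 5)
```

Calling random.seed with a fixed value before generating the data makes the output repeatable.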

Generate a DataFrame with random decimal numbers: 3 columns, 10 rows

The last example is generating dataframe with random floating point numbers.
In this example we are going to use:

np.random.rand(10, 3)

which gives:

array([[0.34322362, 0.58491385, 0.0421841 ],
       [0.72594607, 0.99322651, 0.72207976],
       [0.86410573, 0.92330185, 0.84427074],
       ...])

and this is the full code:

pd.DataFrame(np.random.rand(10, 3), columns=list('XYZ'))

result:

	X	Y	Z
0	0.769363	0.122776	0.880724
1	0.114435	0.658999	0.193133
2	0.547094	0.037303	0.058781
3	0.335808	0.359005	0.047081
4	0.787799	0.834477	0.594807
5	0.926310	0.653232	0.592580
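np.random.rand always returns values in the range from 0 up to, but not including, 1. When the decimals should fall in another range, a uniform draw is one option. The sketch below assumes a range of -1 to 1 and rounding to 3 decimal places; both values, the seed, and the name df_float are picked purely for illustration.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is repeatable (seed chosen arbitrarily).
rng = np.random.default_rng(0)

# uniform(low, high, size) draws floats from the interval [low, high).
values = rng.uniform(-1.0, 1.0, size=(10, 3))

# Round to 3 decimal places to keep the test data readable.
df_float = pd.DataFrame(values, columns=list("XYZ")).round(3)
print(df_float.shape)  # (10, 3)
```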
