Itertools module in python for data analytics - Python for Data Analytics


Itertools module in python for data analytics

Itertools lets you combine multiple lists into one, or generate the permutations, combinations, and subsets of a collection, each with a single function call.

You can also do some fun and interesting things with it, such as creating an infinite counting series of numbers, cycling through a sequence endlessly, or repeating a value a given number of times.

The module standardizes a core set of fast, memory-efficient tools that are useful by themselves or in combination.
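As a quick illustration of combining multiple lists into one, here is a minimal sketch using chain() and chain.from_iterable(). The list names and values are made up for this example and are not from the original post.

```python
import itertools

# Hypothetical sample data, for illustration only.
morning = [101, 102]
evening = [201, 202, 203]
weekend = [301]

# chain() walks each list in turn without building intermediate copies.
merged = list(itertools.chain(morning, evening, weekend))
print(merged)  # [101, 102, 201, 202, 203, 301]

# chain.from_iterable() does the same when the lists are already nested.
nested = [morning, evening, weekend]
flattened = list(itertools.chain.from_iterable(nested))
print(flattened == merged)  # True
```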
Let’s generate an infinite sequence of numbers starting from 5 using itertools.count():
count() returns an iterator that you can use in a loop. The commented lines in the code show the output you would see.
import itertools

for i in itertools.count(5):
    if i == 100:
        break
    print(i)
# 5 6 7 8 9 10 11 12 … (one value per line)

# pass a second argument to set the step size
for i in itertools.count(5, 0.5):
    if i >= 100:
        break
    print(i)
# 5 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 …

Always give the loop a break condition, unless you really want it to run forever.
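An alternative to an explicit break is to bound the infinite iterator itself. The sketch below uses islice() and takewhile(), both part of itertools; the variable names are only illustrative.

```python
import itertools

# Take exactly the first 5 values of an otherwise endless counter.
first_five = list(itertools.islice(itertools.count(5), 5))
print(first_five)  # [5, 6, 7, 8, 9]

# takewhile() stops as soon as the condition becomes false.
below_twelve = list(itertools.takewhile(lambda n: n < 12, itertools.count(5, 2)))
print(below_twelve)  # [5, 7, 9, 11]
```

Either form keeps the stopping rule next to the iterator, so the loop body cannot forget it.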

What if we need to loop over ABCD again and again? Pass any sequence of letters, digits, or symbols to cycle() and it repeats that sequence endlessly. Let’s see:

import itertools

count = 0
for i in itertools.cycle('ABCD'):
    # stop after 51 letters; cycle() never ends on its own
    if count > 50:
        break
    print(i)
    count += 1
# A B C D A B C D A B C …
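One practical use of cycle() is round-robin assignment. In this sketch the task and worker names are hypothetical; pairing the endless cycle with zip() means the loop ends when the finite list runs out, so no counter is needed.

```python
import itertools

# Hypothetical task and worker names, for illustration only.
tasks = ["load", "clean", "merge", "plot", "export"]
workers = ["A", "B"]

# zip() stops when the shorter input (tasks) is exhausted,
# so the endless cycle() needs no explicit break.
assignments = list(zip(tasks, itertools.cycle(workers)))
print(assignments)
# [('load', 'A'), ('clean', 'B'), ('merge', 'A'), ('plot', 'B'), ('export', 'A')]
```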

Suppose you have a long collection and need every possible subset that can be formed by taking two elements at a time. itertools.combinations() generates them all. The argument can be a NumPy array, a list, a tuple, or just a string.

import itertools

# the list whose 2-element combinations we want
ls = [5, 6, 8, 7, 9, 10, 12]
for i in itertools.combinations(ls, 2):
    print(i)
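To check what the loop above produces, the sketch below collects the pairs into a list and compares their number with the binomial coefficient. It assumes Python 3.8 or later, where math.comb() is available.

```python
import itertools
import math

ls = [5, 6, 8, 7, 9, 10, 12]
pairs = list(itertools.combinations(ls, 2))

# Pairs come out as tuples, in the order the elements appear in the input.
print(pairs[:3])   # [(5, 6), (5, 8), (5, 7)]

# 7 elements taken 2 at a time gives 21 pairs.
print(len(pairs))  # 21
print(len(pairs) == math.comb(len(ls), 2))  # True
```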

Let’s look at another method that comes in handy when you are cleaning data. Here we have a list of integers from which we want to keep only the positive numbers and drop zero and all negative values.

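import itertools

age = [18, 19, 17, 20, 0, -20, 24, -1]
# filterfalse() keeps the items for which the predicate is false,
# so ages that are zero or negative are dropped
filtered_age = list(itertools.filterfalse(lambda x: x <= 0, age))
print(filtered_age)
# [18, 19, 17, 20, 24]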

The output is a list containing only the positive ages. After all, an age cannot be zero or negative unless we are travelling back in time.
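For comparison, the same cleaning step can be written with the built-in filter() or a list comprehension. This is only a sketch of equivalent forms; the helper name is_invalid is introduced here for illustration.

```python
import itertools

age = [18, 19, 17, 20, 0, -20, 24, -1]

def is_invalid(x):
    """Hypothetical helper: an age of zero or below is treated as invalid."""
    return x <= 0

# filterfalse() drops the items the predicate flags as invalid.
dropped_invalid = list(itertools.filterfalse(is_invalid, age))

# filter() keeps the items for which its predicate is true, so the test is inverted.
kept_valid = list(filter(lambda x: x > 0, age))

# A list comprehension states the same rule inline.
comprehension = [x for x in age if x > 0]

print(dropped_invalid)  # [18, 19, 17, 20, 24]
print(dropped_invalid == kept_valid == comprehension)  # True
```

filterfalse() reads most naturally when you already have a function that describes the bad rows.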
Iterators terminating on the shortest input sequence:

| Iterator | Arguments | Results | Example |
| --- | --- | --- | --- |
| accumulate() | p [,func] | p0, p0+p1, p0+p1+p2, … | accumulate([1,2,3,4,5]) --> 1 3 6 10 15 |
| chain() | p, q, … | p0, p1, … plast, q0, q1, … | chain('ABC', 'DEF') --> A B C D E F |
| chain.from_iterable() | iterable | p0, p1, … plast, q0, q1, … | chain.from_iterable(['ABC','DEF']) --> A B C D E F |
| compress() | data, selectors | (d[0] if s[0]), (d[1] if s[1]), … | compress('ABCDEF',[1,0,1,0,1,1]) --> A C E F |
| dropwhile() | pred, seq | seq[n], seq[n+1], starting when pred fails | dropwhile(lambda x: x<5,[1,4,6,4,1]) --> 6 4 1 |
| filterfalse() | pred, seq | elements of seq where pred(elem) is false | filterfalse(lambda x: x%2,range(10)) --> 0 2 4 6 8 |
| groupby() | iterable[, key] | sub-iterators grouped by value of key(v) | |
| islice() | seq, [start,] stop [, step] | elements from seq[start:stop:step] | islice('ABCDEFG', 2, None) --> C D E F G |
| starmap() | func, seq | func(*seq[0]), func(*seq[1]), … | starmap(pow, [(2,5), (3,2),(10,3)]) --> 32 9 1000 |
| takewhile() | pred, seq | seq[0], seq[1], until pred fails | takewhile(lambda x: x<5,[1,4,6,4,1]) --> 1 4 |
| tee() | it, n | it1, it2, … itn splits one iterator into n | |
| zip_longest() | p, q, … | (p[0], q[0]), (p[1], q[1]), … | zip_longest('ABCD', 'xy',fillvalue='-') --> Ax By C- D- |
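The table lists no example for groupby() or tee(), so here is a small sketch of each. The records are hypothetical. Note that groupby() only groups consecutive items, which is why the data is sorted by the same key first.

```python
import itertools

# Hypothetical records: (city, age).
records = [("Pune", 31), ("Delhi", 25), ("Pune", 22), ("Delhi", 40)]

def by_city(rec):
    return rec[0]

# groupby() groups only adjacent items, so sort by the grouping key first.
grouped = {
    city: [age for _, age in rows]
    for city, rows in itertools.groupby(sorted(records, key=by_city), key=by_city)
}
print(grouped)  # {'Delhi': [25, 40], 'Pune': [31, 22]}

# tee() splits one iterator into independent copies that can each be consumed once.
first, second = itertools.tee(iter([1, 2, 3]), 2)
total = sum(first)
largest = max(second)
print(total, largest)  # 6 3
```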

Infinite iterators:

| Iterator | Arguments | Results | Example |
| --- | --- | --- | --- |
| count() | start, [step] | start, start+step, start+2*step, … | count(10) --> 10 11 12 13 14 ... |
| cycle() | p | p0, p1, … plast, p0, p1, … | cycle('ABCD') --> A B C D A B C D ... |
| repeat() | elem [,n] | elem, elem, elem, … endlessly or up to n times | repeat(10, 3) --> 10 10 10 |
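Infinite iterators are most useful when paired with a finite one, because map() and zip() stop at the shortest input. A minimal sketch, with illustrative variable names:

```python
import itertools

# repeat(2) supplies the constant exponent for every value in range(5).
squares = list(map(pow, range(5), itertools.repeat(2)))
print(squares)  # [0, 1, 4, 9, 16]

# count(1) numbers the items starting from 1, much like enumerate(items, 1).
labels = list(zip(itertools.count(1), ["a", "b", "c"]))
print(labels)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```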

Combinatoric iterators:

| Iterator | Arguments | Results |
| --- | --- | --- |
| product() | p, q, … [repeat=1] | cartesian product, equivalent to a nested for-loop |
| permutations() | p[, r] | r-length tuples, all possible orderings, no repeated elements |
| combinations() | p, r | r-length tuples, in sorted order, no repeated elements |
| combinations_with_replacement() | p, r | r-length tuples, in sorted order, with repeated elements |

| Example | Results |
| --- | --- |
| product('ABCD', repeat=2) | AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD |
| permutations('ABCD', 2) | AB AC AD BA BC BD CA CB CD DA DB DC |
| combinations('ABCD', 2) | AB AC AD BC BD CD |
| combinations_with_replacement('ABCD',2) | AA AB AC AD BB BC BD CC CD DD |
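To make "equivalent to a nested for-loop" concrete, the sketch below compares product() with an explicit double loop and then counts the results of all four combinatoric iterators for 'ABCD' taken two at a time. The variable names are illustrative.

```python
import itertools

letters = "AB"
digits = [1, 2]

# product() yields the same tuples, in the same order, as two nested loops.
from_product = list(itertools.product(letters, digits))
from_loops = [(l, d) for l in letters for d in digits]
print(from_product)                # [('A', 1), ('A', 2), ('B', 1), ('B', 2)]
print(from_product == from_loops)  # True

# Result counts for 4 items taken 2 at a time.
counts = {
    "product": len(list(itertools.product("ABCD", repeat=2))),
    "permutations": len(list(itertools.permutations("ABCD", 2))),
    "combinations": len(list(itertools.combinations("ABCD", 2))),
    "combinations_with_replacement": len(
        list(itertools.combinations_with_replacement("ABCD", 2))
    ),
}
print(counts)
# {'product': 16, 'permutations': 12, 'combinations': 6, 'combinations_with_replacement': 10}
```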

Source for the method tables: the official Python documentation.

