The Python story for Data Scientists — Part 2

Thoufeek
4 min read · May 30, 2021

I deliberately ended Part 1 by mentioning lists and tuples without explaining what they are. No, it wasn't a cliff-hanger of any sort. Honestly, if you or I were to get excited about lists in Python, I would suggest we both get a life. I ended Part 1 there because it seemed like a logical place to split things. This is where we start losing people. This is where the real shit begins. Any idiot can assign variables, create pointless expressions, typecast or capitalize. But understanding lists, tuples, dictionaries and sets, and working with them effortlessly, demands skill. We are dumb no more.

Lists, tuples, dictionaries and sets are all compound data types: each can hold multiple values of any data type, including other compound data types. Lists and tuples look similar and are therefore brothers. The tuple, being the older one, has had a tough life. Tuples are immutable. They are written within parentheses, with the elements separated by commas. tup_1 = (1, 3.14, "pi", False, (1, 2)) is a tuple with all sorts of different data types in it. Remember when I said strings are more or less a list or a tuple? So any operation that works on a string, other than the exclusive string methods, works on a tuple as well. We can index them the same way (tup_1[1] is 3.14, tup_1[4][1] is 2, tup_1[-2] is False), concatenate them using '+', slice or stride them, use len to get the length, and so on. But what does immutable mean? Once a tuple is created, we cannot add, delete or replace any of its elements. The guy is literally immovable.
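Here is a minimal sketch of the tuple operations above, just to see the older brother in action. The variable names other than tup_1 (second, combined and so on) are mine and purely illustrative. Keep in mind that Python counts from 0, so the second element lives at index 1.

```python
# A tuple mixing several data types, including a nested tuple.
tup_1 = (1, 3.14, "pi", False, (1, 2))

# Indexing starts at 0; negative indexes count from the end.
second = tup_1[1]            # 3.14
nested_last = tup_1[4][1]    # 2, the second element of the nested tuple
second_to_last = tup_1[-2]   # False

# Concatenation builds a NEW tuple; tup_1 itself is untouched.
combined = tup_1 + ("extra",)

# Slicing and striding work exactly as they do on strings.
first_two = tup_1[0:2]       # (1, 3.14)
every_other = tup_1[::2]     # (1, 'pi', (1, 2))

length = len(tup_1)          # 5

# Immutability: replacing an element raises a TypeError.
try:
    tup_1[0] = 99
    is_immutable = False
except TypeError:
    is_immutable = True
```

Notice the trailing comma in ("extra",). Without it, Python sees a plain string in parentheses, not a one-element tuple.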

But the younger brother, the list, is the cool, popular guy. He looks and works a lot like a tuple (just written in square brackets instead of parentheses) but is a bit more open-minded. He is open to change, i.e. lists are mutable. We can add a single element to the end using append(<new element>), or add several elements at once using extend(<new elements>). A related trick: splitting a string about a delimiter with the string's split("<delimiter here>") method returns a list. The guy is chill. But one thing he does mind is referencing. list_b = list_a doesn't create a copy of list_a. Both names refer to the same list in memory, so any change made through list_b is also a change to list_a. So whenever we want to change a list but still keep the original, we clone it first with list_b = list_a[:] and then change list_b. Also, time and again he looks out for his socially awkward brother. Whenever we want to make a change to a tuple, we convert it to a list using the list() function, make the change, and convert it back using the tuple() function.
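A short sketch of the list behaviour described above, with the referencing gotcha front and centre. The sample values and the names words and frozen are made up for illustration.

```python
list_a = [1, 2, 3]

# append adds ONE element to the end; extend adds every element of an iterable.
list_a.append(4)         # [1, 2, 3, 4]
list_a.extend([5, 6])    # [1, 2, 3, 4, 5, 6]

# split is a string method that returns a list.
words = "to be or not".split(" ")   # ['to', 'be', 'or', 'not']

# Plain assignment does NOT copy: both names point at the same list.
list_b = list_a
list_b[0] = "changed"
same_object = list_b is list_a      # True, and list_a[0] is now 'changed'

# Slicing with [:] makes a (shallow) copy that can change independently.
list_c = list_a[:]
list_c[1] = "only in the clone"     # list_a[1] is still 2

# "Changing" a tuple: convert to a list, edit, convert back.
frozen = (1, 2, 3)
temp = list(frozen)
temp[0] = 100
frozen = tuple(temp)                # (100, 2, 3)
```

One caveat: list_a[:] copies only the outer list. If the list holds other lists, the clone still shares those inner lists with the original.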

Impressed by the brothers? Wait till you meet their cousins. Dictionaries not only let you store any kind of value but also let you use your own keys to index them, quite similar to how a real dictionary works. ratings_dict = {"Part 1": 7, "Part 2": [1, 23, 34], "Part 3": (3, 6.6, 67)} is a dictionary where "Part 1", "Part 2" and "Part 3" are the keys and 7, [1, 23, 34] and (3, 6.6, 67) are the values. Values can be mutable, immutable or duplicated; keys, on the other hand, must be unique and immutable (strictly speaking, hashable). Keys are used to access the values, like indexes in lists. ratings_dict["Part 3"] would return (3, 6.6, 67). We can easily add a new entry like ratings_dict["Part 4"] = 4.20, delete one like del ratings_dict["Part 3"], check whether a key exists like "Part 5" in ratings_dict (returning True if present and False if not), and see all the keys with ratings_dict.keys() or all the values with ratings_dict.values().
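The dictionary operations above, as a small sketch you can paste into an interpreter. The get call at the end is my addition and not something covered above; it is a standard dict method that returns a fallback instead of raising a KeyError when the key is missing.

```python
ratings_dict = {"Part 1": 7, "Part 2": [1, 23, 34], "Part 3": (3, 6.6, 67)}

# Look a value up by its key. The key must match exactly, spaces included.
part_3 = ratings_dict["Part 3"]          # (3, 6.6, 67)

# Add a new entry, then delete an existing one.
ratings_dict["Part 4"] = 4.20
del ratings_dict["Part 3"]

# `in` checks the KEYS, not the values.
has_part_5 = "Part 5" in ratings_dict    # False
has_part_4 = "Part 4" in ratings_dict    # True

# keys() and values() return views; list() turns them into plain lists.
all_keys = list(ratings_dict.keys())     # ['Part 1', 'Part 2', 'Part 4']
all_values = list(ratings_dict.values()) # [7, [1, 23, 34], 4.2]

# Not from the text above: get() avoids a KeyError for a missing key.
fallback = ratings_dict.get("Part 3", "not rated")   # 'not rated'
```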

Sets in Python also work like sets in real life. If you went to the right math class, you would know that a set is an unordered collection of unique elements. We can define a set using curly brackets. Even if we try to sneak a duplicate element into it, Python detects it and keeps only one. So my_set = {1.2, 2.3, 2.3} is the same as my_set = {1.2, 2.3}. I know what you are thinking. Doesn't that sound like a list, but with curls? Yes it does, with two catches: no duplicates, and no order, so we can't index a set. We can also turn a list into a set using the set() function. my_set2 = set([1, 2, 2]) creates the set {1, 2}. The best way to work with sets, in my experience, is to think in Venn diagrams. Set operations become a lot easier to understand once we can picture them that way. Anyway, my_set.add("added") adds "added" to the set. But adding it again wouldn't change a thing. my_set2.remove(2) removes the item 2 from my_set2 (removing an item that isn't in the set raises a KeyError), and "added" in my_set checks whether "added" is in the set and returns True or False.
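A quick sketch of the basic set behaviour, assuming nothing beyond built-in Python. Here remove is called on my_set2, since that is the set that actually contains 2. The discard call is my addition: it is the forgiving sibling of remove and does nothing when the item is missing.

```python
# Duplicates are dropped the moment the set is created.
my_set = {1.2, 2.3, 2.3}
no_duplicates = my_set == {1.2, 2.3}     # True

# Casting a list to a set removes its duplicates too.
my_set2 = set([1, 2, 2])                 # {1, 2}

# Adding the same element twice changes nothing the second time.
my_set.add("added")
my_set.add("added")
size_after_adds = len(my_set)            # 3

# Membership test returns True or False.
is_member = "added" in my_set            # True

# remove deletes an element that is present...
my_set2.remove(2)                        # my_set2 is now {1}

# ...but raises KeyError when the element is missing.
try:
    my_set2.remove(99)
    remove_failed = False
except KeyError:
    remove_failed = True

# Not from the text above: discard never raises, present or not.
my_set2.discard(99)
```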

There are still a lot of set operations we can perform on two or more sets. myset & yourset returns the intersection of the two sets. Similarly, myset.union(yourset) returns their union. If yourset is a subset of myset, then yourset.issubset(myset) returns True, and if not, False.
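To make the Venn diagram picture concrete, here is a small sketch of those operations. The contents of the sets are hypothetical, and theirset is an extra set I introduced so the subset check has a False case to show.

```python
# Hypothetical example sets; any hashable values would do.
myset = {"numpy", "pandas", "matplotlib", "scipy"}
yourset = {"pandas", "scipy"}
theirset = {"pandas", "seaborn"}

# Intersection: elements present in BOTH sets (the overlap of the circles).
common = myset & theirset                 # {'pandas'}

# Union: elements present in EITHER set (both circles together).
everything = myset.union(theirset)

# Subset check: is every element of the first set also in the second?
yours_in_mine = yourset.issubset(myset)   # True
theirs_in_mine = theirset.issubset(myset) # False, 'seaborn' is missing

# None of these calls modify the original sets; each returns a new result.
```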

We now know what data in Python means and how to hold it. Now what? Let's do something with it in Part 3.
