Improved Performance. The generator created by xrange yields each number one at a time, and sum consumes them to accumulate the total. Using range as an iterable is the dominant use case, and Python 3.x reflects this: the range built-in returns a lazy sequence-type object instead of a list. Python - Generator. Python lets you create your own iterator function with a generator. A generator is a special type of function that does not return a single value; instead, it returns an iterator object that produces a sequence of values. In a generator function, a yield statement is used rather than a return statement.
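A minimal Python 3 sketch of such a generator function (squares is an illustrative name, not from the text above):

```python
def squares(n):
    """Yield the squares 0, 1, 4, ... lazily, one at a time."""
    for i in range(n):
        yield i * i

gen = squares(4)         # nothing is computed yet
print(next(gen))         # 0, computed on demand
print(list(gen))         # [1, 4, 9], the remaining values
print(sum(squares(10)))  # 285, consumed without ever building a list
```

The sum call mirrors the xrange example above: each value is produced, consumed, and discarded, so no intermediate list exists.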
random: Generate pseudo-random numbers. This module implements pseudo-random number generators for various distributions. For integers, there is uniform selection from a range.
For sequences, there is uniform selection of a random element, a function to generate a random permutation of a list in-place, and a function for random sampling without replacement. On the real line, there are functions to compute uniform, normal (Gaussian), lognormal, negative exponential, gamma, and beta distributions. For generating distributions of angles, the von Mises distribution is available. Almost all module functions depend on the basic function random(), which generates a random float uniformly in the semi-open range [0.0, 1.0).
Python uses the Mersenne Twister as the core generator. It produces 53-bit precision floats and has a period of 2**19937 - 1. The underlying implementation in C is both fast and threadsafe.
The Mersenne Twister is one of the most extensively tested random number generators in existence. However, being completely deterministic, it is not suitable for all purposes, and is completely unsuitable for cryptographic purposes. The functions supplied by this module are actually bound methods of a hidden instance of the random.Random class.
You can instantiate your own instances of Random to get generators that don’t share state. This is especially useful for multi-threaded programs: create a different instance of Random for each thread, and use the jumpahead() method to make it likely that the generated sequences seen by each thread don’t overlap. Class Random can also be subclassed if you want to use a different basic generator of your own devising: in that case, override the random(), seed(), getstate(), setstate() and jumpahead() methods.
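A short Python 3 sketch of independent, explicitly seeded instances (the specific values drawn are version-dependent, so only reproducibility is checked):

```python
import random

# Two generators with separate, explicitly seeded state: useful when each
# thread (or subsystem) needs a reproducible stream no other code perturbs.
gen_a = random.Random(1)
gen_b = random.Random(2)

a_values = [gen_a.randint(0, 9) for _ in range(5)]
b_values = [gen_b.randint(0, 9) for _ in range(5)]

# Re-seeding gen_a reproduces its stream, untouched by gen_b's activity.
gen_a.seed(1)
assert [gen_a.randint(0, 9) for _ in range(5)] == a_values
```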
Optionally, a new generator subclass can supply a getrandbits() method; this allows randrange() to produce selections over an arbitrarily large range. New in version 2.4: the getrandbits() method. As an example of subclassing, the random module provides the WichmannHill class, which implements an alternative generator in pure Python.
The class provides a backward compatible way to reproduce results from earlier versions of Python, which used the Wichmann-Hill algorithm as the core generator. Note that this Wichmann-Hill generator can no longer be recommended: its period is too short by contemporary standards, and the sequence generated is known to fail some stringent randomness tests. See the references below for a recent variant that repairs these flaws.
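Subclassing as described above can be sketched in Python 3 like this; the LCG class and its constants are illustrative only (the classic Numerical Recipes parameters), not a generator to use for real work:

```python
import random

class LCG(random.Random):
    """Toy subclass that swaps in a linear congruential core generator."""

    def seed(self, a=None):
        self._state = (a or 0) & 0xFFFFFFFF

    def random(self):
        # The overridden basic generator: return a float in [0.0, 1.0).
        self._state = (1664525 * self._state + 1013904223) & 0xFFFFFFFF
        return self._state / 2 ** 32

    def getstate(self):
        return self._state

    def setstate(self, state):
        self._state = state

rng = LCG(42)
sample = [rng.random() for _ in range(3)]  # deterministic for a given seed
```

Inherited methods built on random(), such as uniform(), now draw from the LCG stream.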
Warning: The pseudo-random generators of this module should not be used for security purposes. Use os.urandom() or SystemRandom if you require a cryptographically secure pseudo-random number generator. Bookkeeping functions: random.seed(a=None) initializes the internal state of the random number generator. None or no argument seeds from the current time or from an operating-system-specific randomness source if available (see the os.urandom() function for details on availability). If a is not None or an int or a long, then hash(a) is used instead. Note that the hash values for some types are nondeterministic when hash randomization is enabled.
New in version 1.5.2. random.randint(a, b) returns a random integer N such that a <= N <= b. random.sample(population, k) (new in version 2.3) returns a new k-length list containing elements from the population while leaving the original population unchanged.
The resulting list is in selection order so that all sub-slices will also be valid random samples. This allows raffle winners (the sample) to be partitioned into grand prize and second place winners (the subslices).
Members of the population need not be hashable or unique. If the population contains repeats, then each occurrence is a possible selection in the sample.
To choose a sample from a range of integers, use an xrange() object as an argument. This is especially fast and space-efficient for sampling from a large population: sample(xrange(10000000), 60). The following functions generate specific real-valued distributions. Function parameters are named after the corresponding variables in the distribution’s equation, as used in common mathematical practice; most of these equations can be found in any statistics text. random.random() returns the next random floating-point number in the range [0.0, 1.0). random.uniform(a, b) returns a random floating-point number N such that a <= N <= b for a <= b (and b <= N <= a for b < a).
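In Python 3 the same sampling idiom uses a range object, since xrange is gone:

```python
import random

# range() is lazy in Python 3, so the 10-million-element population
# is never materialized as a list.
winners = random.sample(range(10_000_000), 60)
```

Because sample() draws without replacement, all 60 winners are distinct, which is what makes the raffle partitioning described above valid.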
New in version 2.6. random.betavariate(alpha, beta): Beta distribution. Conditions on the parameters are alpha > 0 and beta > 0. Returned values range between 0 and 1. random.expovariate(lambd): Exponential distribution.
lambd is 1.0 divided by the desired mean. It should be nonzero. (The parameter would be called “lambda”, but that is a reserved word in Python.) Returned values range from 0 to positive infinity if lambd is positive, and from negative infinity to 0 if lambd is negative. random.gammavariate(alpha, beta): Gamma distribution.
(Not the gamma function!) Conditions on the parameters are alpha > 0 and beta > 0. The probability distribution function is:

    pdf(x) = x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta ** alpha)

random.gauss(mu, sigma): Gaussian distribution. mu is the mean, and sigma is the standard deviation. This is slightly faster than the normalvariate() function defined below.
random.lognormvariate(mu, sigma): Log normal distribution. If you take the natural logarithm of this distribution, you’ll get a normal distribution with mean mu and standard deviation sigma. mu can have any value, and sigma must be greater than zero. random.normalvariate(mu, sigma): Normal distribution. mu is the mean, and sigma is the standard deviation. random.vonmisesvariate(mu, kappa): mu is the mean angle, expressed in radians between 0 and 2*pi, and kappa is the concentration parameter, which must be greater than or equal to zero. If kappa is equal to zero, this distribution reduces to a uniform random angle over the range 0 to 2*pi. random.paretovariate(alpha): Pareto distribution. alpha is the shape parameter. random.weibullvariate(alpha, beta): Weibull distribution. alpha is the scale parameter and beta is the shape parameter.
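A quick Python 3 tour of several of these functions; the specific draws depend on the seed, so only the broad range properties stated above are relied on:

```python
import random

rng = random.Random(12345)  # a private instance, so the demo has its own state

u = rng.uniform(1.0, 10.0)        # float N with 1.0 <= N <= 10.0
e = rng.expovariate(0.5)          # exponential; mean is 1/lambd == 2.0, so e >= 0
b = rng.betavariate(2.0, 5.0)     # beta draws always land between 0 and 1
g = rng.gauss(mu=0.0, sigma=1.0)  # one draw from a standard normal
w = rng.weibullvariate(1.0, 1.5)  # scale alpha=1.0, shape beta=1.5; w >= 0
```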
Alternative Generators: class random.WichmannHill(seed): Class that implements the Wichmann-Hill algorithm as the core generator. Has all of the same methods as Random plus the whseed() method described below. Because this class is implemented in pure Python, it is not threadsafe and may require locks between calls. The period of the generator is 6,953,607,871,644, which is small enough to require care that two independent random sequences do not overlap. random.whseed(x): This is obsolete, supplied for bit-level compatibility with versions of Python prior to 2.1. It does not guarantee that distinct integer arguments yield distinct internal states, and can yield no more than about 2**24 distinct internal states in all. class random.SystemRandom(seed): Class that uses the os.urandom() function for generating random numbers from sources provided by the operating system. Not available on all systems. Does not rely on software state, and sequences are not reproducible.
Accordingly, the seed() and jumpahead() methods have no effect and are ignored. The getstate() and setstate() methods raise NotImplementedError if called. Examples of basic usage:

    >>> random.random()              # Random float x, 0.0 <= x < 1.0
    >>> random.uniform(1, 10)        # Random float x, 1.0 <= x < 10.0
    >>> random.randint(1, 10)        # Integer from 1 to 10, endpoints included
    7
    >>> random.randrange(0, 101, 2)  # Even integer from 0 to 100
    26
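A small Python 3 example of SystemRandom, assuming an operating-system randomness source is available:

```python
import random

sysrand = random.SystemRandom()  # draws from os.urandom(); state is not reproducible

token = sysrand.randint(1, 10)          # integer from 1 to 10, endpoints included
pick = sysrand.choice(['a', 'b', 'c'])  # uniform choice from a sequence
x = sysrand.random()                    # float in [0.0, 1.0)

sysrand.seed(1234)  # accepted but ignored: the OS source cannot be re-seeded
```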
Generators give you lazy evaluation. You use them by iterating over them, either explicitly with 'for' or implicitly by passing one to any function or construct that iterates. You can think of generators as returning multiple items, as if they returned a list, but instead of returning them all at once they return them one by one, and the generator function is paused until the next item is requested. Generators are good for calculating large sets of results (in particular calculations involving loops themselves) where you don't know if you are going to need all the results, or where you don't want to allocate the memory for all the results at the same time. They also suit situations where the generator uses another generator, or consumes some other resource, and it's more convenient if that happens as late as possible.
Another use for generators (that is really the same) is to replace callbacks with iteration. In some situations you want a function to do a lot of work and occasionally report back to the caller. Traditionally you'd use a callback function for this. You pass this callback to the work-function and it would periodically call this callback. The generator approach is that the work-function (now a generator) knows nothing about the callback, and merely yields whenever it wants to report something.
The caller, instead of writing a separate callback and passing that to the work-function, does all the reporting work in a little 'for' loop around the generator. For example, say you wrote a 'filesystem search' program. You could perform the search in its entirety, collect the results and then display them one at a time. All of the results would have to be collected before you showed the first, and all of the results would be in memory at the same time. Or you could display the results while you find them, which would be more memory efficient and much friendlier towards the user.
The latter could be done by passing the result-printing function to the filesystem-search function, or it could be done by just making the search function a generator and iterating over the result. If you want to see an example of the latter two approaches, see os.path.walk (the old filesystem-walking function with callback) and os.walk (the new filesystem-walking generator). Of course, if you really wanted to collect all results in a list, the generator approach is trivial to convert to the big-list approach: biglist = list(thegenerator). @StevenLu: Unless it goes to the trouble of manually launching threads before the yield and joining them after to get the next result, it does not execute in parallel (and no standard library generator does this; secretly launching threads is frowned upon). The generator pauses at each yield until the next value is requested. If the generator is wrapping I/O, the OS might be proactively caching data from the file on the assumption it will be requested shortly, but that's the OS; Python isn't involved.
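The callback-versus-generator contrast can be sketched without touching the filesystem; search_matches and the predicate below are hypothetical names, not part of any library:

```python
def search_matches(items, predicate):
    """Report each match as it is found, instead of invoking a callback."""
    for item in items:
        if predicate(item):
            yield item  # pause here until the caller asks for the next match

# The caller does its "reporting" in a plain for loop around the generator.
found = []
for match in search_matches(range(20), lambda n: n % 7 == 0):
    found.append(match)

# Collecting everything is still trivial when the big-list behaviour is wanted:
as_list = list(search_matches(range(20), lambda n: n % 7 == 0))
```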
One of the reasons to use a generator is to make the solution clearer for some kinds of problems. The other is to treat results one at a time, avoiding building huge lists of results that you would process separately anyway.
If you have a fibonacci-up-to-n function like this:

    # function version
    def fibon(n):
        a = b = 1
        result = []
        for i in xrange(n):
            result.append(a)
            a, b = b, a + b
        return result

You can more easily write the function as this:

    # generator version
    def fibon(n):
        a = b = 1
        for i in xrange(n):
            yield a
            a, b = b, a + b

The generator version is clearer. And if you use the function like this:

    for x in fibon(1000000):
        print x,

then with the generator version the whole 1000000-item list is never created at all; values are produced one at a time. That would not be the case with the list version, where the entire list would be built first. I found the following explanation, which cleared up my doubt.
Because someone who doesn't know about generators may not know about yield either: Return: The return statement is where all the local variables are destroyed and the resulting value is given back (returned) to the caller. Should the same function be called some time later, the function will get a fresh new set of variables. Yield: But what if the local variables aren't thrown away when we exit a function? This implies that we can resume the function where we left off. This is where generators come in: the yield statement resumes where the function left off.

    def generateintegers(N):
        for i in xrange(N):
            yield i

    >>> gen = generateintegers(3)
    >>> gen
    <generator object generateintegers at ...>
    >>> gen.next()
    0
    >>> gen.next()
    1
    >>> gen.next()
    2

So that's the difference between the return and yield statements in Python.
The yield statement is what makes a function a generator function. So generators are a simple and powerful tool for creating iterators. They are written like regular functions, but they use the yield statement whenever they want to return data. Each time next() is called, the generator resumes where it left off (it remembers all the data values and which statement was last executed). Real-World Example: Let's say you have 100 million domains in your MySQL table, and you would like to update the Alexa rank for each domain. The first thing you need is to select your domain names from the database.
Let's say your table name is domains and the column name is domain. If you use SELECT domain FROM domains, it's going to return 100 million rows, which is going to consume a lot of memory, and your server might crash. So you decide to run the program in batches. Let's say our batch size is 1000. In our first batch we will query the first 1000 rows, check the Alexa rank for each domain and update the database row.
In our second batch we will work on the next 1000 rows. In our third batch it will be from 2001 to 3000 and so on.
Now we need a generator function which generates our batches. Here is our generator function:

    def ResultGenerator(cursor, batchsize=1000):
        while True:
            results = cursor.fetchmany(batchsize)
            if not results:
                break
            for result in results:
                yield result

As you can see, our function keeps yielding the results.
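The batching pattern can be exercised without a real database; result_generator and FakeCursor below are a self-contained Python 3 re-sketch, with FakeCursor as a hypothetical stand-in exposing only the DB-API fetchmany() call:

```python
def result_generator(cursor, batchsize=1000):
    """Yield rows one at a time, fetching them in batches behind the scenes."""
    while True:
        results = cursor.fetchmany(batchsize)
        if not results:
            break
        for result in results:
            yield result

class FakeCursor:
    """Hypothetical stand-in exposing only the DB-API fetchmany() call."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._pos = 0

    def fetchmany(self, size):
        batch = self._rows[self._pos:self._pos + size]
        self._pos += size
        return batch

cursor = FakeCursor(f"domain{i}.example" for i in range(2500))
domains = list(result_generator(cursor, batchsize=1000))  # 3 fetches: 1000 + 1000 + 500
```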
If you used the keyword return instead of yield, then the whole function would end once it reached return. Return returns only once; yield returns multiple times. If a function uses the keyword yield, then it's a generator. Now you can iterate like this:

    db = MySQLdb.connect(host='localhost', user='root', passwd='root', db='domains')
    cursor = db.cursor()
    cursor.execute('SELECT domain FROM domains')
    for result in ResultGenerator(cursor):
        doSomethingWith(result)
    db.close()

I have found that generators are very helpful for cleaning up your code and for giving you a unique way to encapsulate and modularize it. In a situation where you need something to constantly spit out values based on its own internal processing, and where that something needs to be called from anywhere in your code (not just within a loop or a block, for example), generators are the feature to use. The simple explanation: consider a for statement:

    for item in iterable:
        dostuff(item)

A lot of the time, the items in iterable don't all need to exist from the start; they can be generated on the fly as they're required.
This can be a lot more efficient in both space (you never need to store all the items simultaneously) and time (the iteration may finish before all the items are needed).
Other times, you don't even know all the items ahead of time. For example:

    for command in userinput():
        dostuffwith(command)

You have no way of knowing all the user's commands beforehand, but you can use a loop like this if you have a generator handing you commands:

    def userinput():
        while True:
            cmd = getcommand()  # waits for the next command
            yield cmd

With generators you can also iterate over infinite sequences, which is of course not possible when iterating over containers.
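Here is a Python 3 sketch of such an infinite generator, trimmed with itertools.islice so the demonstration terminates:

```python
from itertools import islice

def powers_of_two():
    """An infinite generator: impossible to realize as a finite container."""
    value = 1
    while True:
        yield value
        value *= 2

first_five = list(islice(powers_of_two(), 5))  # take just what you need
```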
My favorite uses are 'filter' and 'reduce' operations. Let's say we're reading a file and only want the lines which begin with '##'.

    def filter2sharps(aSequence):
        for l in aSequence:
            if l.startswith('##'):
                yield l

We can then use the generator function in a proper loop:

    source = file(...)
    for line in filter2sharps(source.readlines()):
        print line
    source.close()

The reduce example is similar. Let's say we have a file where we need to locate blocks of lines. Not HTML tags, but lines that happen to look tag-like.

    def reduceLocation(aSequence):
        keep = False
        block = None
        for line in aSequence:
            if line.startswith('.

A practical example where you could make use of a generator is if you have some kind of shape and you want to iterate over its corners, edges or whatever. For my own project (source code) I had a rectangle:

    class Rect:
        def __init__(self, x, y, width, height):
            self.ltop = (x, y)
            self.rtop = (x + width, y)
            self.rbot = (x + width, y + height)
            self.lbot = (x, y + height)

        def __iter__(self):
            yield self.ltop
            yield self.rtop
            yield self.rbot
            yield self.lbot

Now I can create a rectangle and loop over its corners:

    myrect = Rect(50, 50, 100, 100)
    for corner in myrect:
        print(corner)

Instead of __iter__ you could have a method itercorners and call it with for corner in myrect.itercorners(). It's just more elegant to use __iter__, since then we can use the class instance name directly in the for expression.