{"quiz":" Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. Look at the program on pp.74-75. (a) Instead of the current form of line 16, it could have been if (m[rownum*n+k] == 1) sum++; Fill the blank with a term from our course: This would reduce unnecessary multiplications, but would increase ______. (b) Using the official terms, state what kind of memory each of the following variables is likely stored in: variable | mem. type: sum | ______; (device matrix) | ______; (device vector) | ______. 2. In the program on pp.91-92, give the line number at which the number of row pairs (r,s) with s > r is computed for some fixed r. 3. The program below is an OpenMP version of the mutual outlinks program. However, it uses a different way of apportioning work to threads. Suppose we have a 24x24 matrix, with 2 threads. Then thread 0 will be responsible for rows 0-5 and 18-23, while thread 1 will be responsible for rows 6-11 and 12-17. When I say that a thread is ``responsible'' for row i, it means that this thread will check all row pairs (i,j), j > i. The motivation here is that the larger i is, the fewer row pairs there are for that i, and the ``mirror image'' scheme here (the range 18-23 is obtained from 0-5 by subtracting from 23) is meant to produce better load balancing. Fill in the blanks. #include #include // OpenMP example: finds mean number of mutual outlinks, among all // pairs of Web sites in our set int n, // number of sites (will assume n is even) nth, // number of threads (will assume n/2 divisible by nth) *m, // link matrix tot = 0; // grand total of matches // processes row pairs (i,i+1), (i,i+2), ... 
int procpairs(int i) int j,k,sum=0; for (j = i+1; j < n; j++) for (k = 0; k < n; k++) sum += m[n*i+k] * m[n*j+k]; return sum; float dowork() #pragma omp parallel int pn1,pn2,i; int id = omp_get_thread_num(); nth = omp_get_num_threads(); int n2 = n / 2; int chunk = n2 / nth; // in checking (i,j) pairs, j > i, this thread will // process i from pn1 to pn2, inclusive, and the // \"mirror images\" of those i pn1 = // fill blank here pn2 = pn1 + chunk - 1; int mysum = 0; for (i = pn1; i <= pn2; i++) // put in 0-4 lines here // put in 0-4 lines here int divisor = // fill blank here return ((float) tot)/divisor; int main(int argc, char **argv) int n2 = n/2,i,j; n = atoi(argv[1]); // number of matrix rows/cols int msize = n * n * sizeof(int); m = (int *) malloc(msize); // as a test, fill matrix with random 1s and 0s for (i = 0; i < n; i++) m[n*i+i] = 0; for (j = 0; j < n; j++) if (j != i) m[i*n+j] = rand() if (n < 10) for (i = 0; i < n; i++) for (j = 0; j < n; j++) printf(\" printf(\"\"); tot = 0; float meanml = dowork(); printf(\"mean = Solutions: 1. thread divergence 2. register, global, global 3. [fontsize=-2] #include #include // OpenMP example: finds mean number of mutual outlinks, among all // pairs of Web sites in our set int n, // number of sites (will assume n is even) nth, // number of threads (will assume n/2 divisible by nth) *m, // link matrix tot = 0; // grand total of matches // processes row pairs (i,i+1), (i,i+2), ... 
int procpairs(int i) int j,k,sum=0; for (j = i+1; j < n; j++) for (k = 0; k < n; k++) sum += m[n*i+k] * m[n*j+k]; return sum; float dowork() #pragma omp parallel int pn1,pn2,i; int id = omp_get_thread_num(); nth = omp_get_num_threads(); int n2 = n / 2; int chunk = n2 / nth; // in checking (i,j) pairs, j > i, this thread will // process i from pn1 to pn2, inclusive, and the // \"mirror images\" of those i pn1 = id * chunk; pn2 = pn1 + chunk - 1; int mysum = 0; for (i = pn1; i <= pn2; i++) mysum += procpairs(i); mysum += procpairs(n-1-i); #pragma omp atomic tot += mysum; #pragma omp barrier int divisor = n * (n-1) / 2; return ((float) tot)/divisor; int main(int argc, char **argv) int n2 = n/2,i,j; n = atoi(argv[1]); // number of matrix rows/cols int msize = n * n * sizeof(int); m = (int *) malloc(msize); // as a test, fill matrix with random 1s and 0s for (i = 0; i < n; i++) m[n*i+i] = 0; for (j = 0; j < n; j++) if (j != i) m[i*n+j] = rand() if (n < 10) for (i = 0; i < n; i++) for (j = 0; j < n; j++) printf(\" printf(\"\"); tot = 0; float meanml = dowork(); printf(\"mean = ","course":"ECS158"} {"quiz":" Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (25) Consider line 36, p.91 in the mutual outlinks example. Suppose we were to write this program in MPI, and we wanted each node to know the end grand total. Give the MPI call for this. (You'll need an MPI keyword not shown in the book, which is MPI_SUM.) 2. (25) Consider a scan of the set union operation. We'll represent subsets by vectors, so that for instance (0,1,1,1) would represent the subset consisting of objects 2, 3 and 4 of a 4-object full set, while (1,1,1,0,1) would mean the subset consisting of objects 1, 2, 3 and 5 in a 5-object full set. So in general for an n-element full set, each in Equation (7.1) would be an n-component vector. 
Define in terms of , and a function in the parallel matrix algorithms chapter. (No English, just symbols.) 3. (50) The CUDA code below implements the first stage of the parallel scan method in the book, for sums. It is presumed that after the kernel returns, the host will perform the final step, which is to update the individual scans found by the threads to compute the overall scan. The call will be [fontsize=-2] dim3 dimGrid(nblks,1); dim3 dimBlock(1,1,1); psum<<>>(dx,ds,n); Fill the gaps: [fontsize=-2] // places scan of this thread's section of x into // this thread's section of s; assumes n divisible // by block size; (not assumed efficient) __global__ void psum(int *x, int *s, int n) int i, bnum = blockIdx.x, bgn = ; // index of start of section fin = ; // index of end of section // fill gap with one statement // fill gap with loop Solutions: 1. [fontsize=-2] MPI_Allreduce(&sum,&tot,1,MPI_INT,MPI_SUM,MPI_COMM_WORLD); 2. 3. [fontsize=-2] __global__ void psum(int *x, int *s, int n) int i, bnum = blockIdx.x, npb = n / gridDim.x, bgn = bnum * npb, fin = bgn + npb - 1; s[bgn] = x[bgn]; for (i = bgn+1; i <= fin; i++) s[i] = s[i-1] + x[i]; ","course":"ECS158"} {"quiz":" xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. IMPORTANT NOTE: If you believe that nothing needs to be placed into a blank, simply give NA as your answer. 1. (40) You know that array padding is used to try to get better parallel access to memory banks. The code below is aimed to provide utilities to assist in this. Details are explained in the comments. 
[numbers=left,basicstyle=] #include #include // routines to initialize, read and write // padded versions of a matrix of floats; // the matrix is nominally mxn, but its // rows will be padded on the right ends, // so as to enable a stride of s down each // column; it is assumed that s >= n // allocate space for the padded matrix, // initially empty float *padmalloc(int m, int n, int s) return malloc(BLANKa); // store the value tostore in the matrix q, // at row i, column j; m, n and // s are as in padmalloc() above void setter(float *q, int m, int n, int s, int i, int j, float tostore) BLANKb // fetch the value in the matrix q, // at row i, column j; m, n and s are // as in padmalloc() above float getter(float *q, int m, int n, int s, int i, int j) BLANKc // test example int main() int i; float *q; q = padmalloc(2,2,3); setter(q,2,2,3,1,0,8); printf(\" // check, using GDB // Breakpoint 1, main () at padding.c:31 // 31 printf(\" // (gdb) x/6f q // 0x804b008: 0 0 0 8 // 0x804b018: 0 0 2. (60) The code below does root-finding. The problem and the strategy used by the code are explained in the comments. Pointers to functions are used. You probably have seen these before, but if not don't worry about it; it doesn't affect the parts of the code you must fill in. Suffice it to say that the user-supplied function does get called properly. 
[numbers=left,basicstyle=] #include #include // OpenMP example: root finding // the function f() is known to be negative // at a, positive at b, and to have exactly // one root in (a,b); the procedure runs // for niters iterations // strategy: in each iteration, the current // interval is split into nth equal parts, // and each thread checks its subinterval // for a sign change of f(); if one is // found, this subinterval becomes the // new current interval; the current guess // for the root is the left endpoint of the // current interval // of course, this approach is useful in // parallel only if f() is very expensive // to evaluate // for simplicity, assumes that no endpoint // of a subinterval will ever exactly // coincide with a root float root(float(*f)(float), float inita, float initb, int niters) BLANKa BLANKb int nth = omp_get_num_threads(); int me = omp_get_thread_num(); int iter; BLANKc for (iter = 0; iter < niters; iter++) BLANKd subintwidth = (currb - curra) / nth; myleft = curra + me * subintwidth; myright = myleft + subintwidth; if ((*f)(myleft) < 0 && (*f)(myright) > 0) curra = myleft; currb = myright; return curra; // example float testf(float x) return pow(x-2.1,3); int main(int argc, char **argv) // should print 2.1 printf(\" Solutions: 1. 
[numbers=left,basicstyle=] #include #include // routines to initialize, read and write // padded versions of a matrix of floats; // the matrix is nominally mxn, but its // rows will be padded on the right ends, // so as to enable a stride of s down each // column; it is assumed that s >= n // allocate space for the padded matrix, // initially empty float *padmalloc(int m, int n, int s) return(malloc(m*s*sizeof(float))); // store the value tostore in the matrix q, // at row i, column j; m, n and // s are as in padmalloc() above void setter(float *q, int m, int n, int s, int i, int j, float tostore) *(q + i*s+j) = tostore; // fetch the value in the matrix q, // at row i, column j; m, n and s are // as in padmalloc() above float getter(float *q, int m, int n, int s, int i, int j) return *(q + i*s+j); int main() int i; float *q; q = padmalloc(2,2,3); setter(q,2,2,3,1,0,8); printf(\" // check, using GDB // Breakpoint 1, main () at padding.c:31 // 31 printf(\" // (gdb) x/6f q // 0x804b008: 0 0 0 8 // 0x804b018: 0 0 2. 
[numbers=left,basicstyle=] #include #include // OpenMP example: root finding // the function f() is known to be negative // at a, positive at b, and thus has at // least one root in (a,b); if there are // multiple roots, only one is found; // the procedure runs for niters iterations // strategy: in each iteration, the current // interval is split into nth equal parts, // and each thread checks its subinterval // for a sign change of f(); if one is // found, this subinterval becomes the // new current interval; the current guess // for the root is the left endpoint of the // current interval // of course, this approach is useful in // parallel only if f() is very expensive // to evaluate // for simplicity, assumes that no endpoint // of a subinterval will ever exactly // coincide with a root float root(float(*f)(float), float inita, float initb, int niters) float curra = inita; float currb = initb; #pragma omp parallel int nth = omp_get_num_threads(); int me = omp_get_thread_num(); int iter; for (iter = 0; iter < niters; iter++) #pragma omp barrier float subintwidth = (currb - curra) / nth; float myleft = curra + me * subintwidth; float myright = myleft + subintwidth; if ((*f)(myleft) < 0 && (*f)(myright) > 0) curra = myleft; currb = myright; return curra; float testf(float x) return pow(x-2.1,3); int main(int argc, char **argv) printf(\" ","course":"ECS158"} {"quiz":" Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. This problem concerns the Pthreads code for counting prime numbers, in pages 6ff. [(a)] (20) Fill in the blanks: There is a critical section in lines through . [(b)] (20) Fill in the blanks: If n is 100, then the amount of work printed out for a thread could range from to . 2. 
(20) Suppose a program that uses the shared-memory paradigm includes the code (given here in pseudocode) if my_thread_number == 0 for i = 0 to number_of_threads x[i] = i * i for i = 0 to number_of_threads z[] = x[i] - y[i] next_line: Fill in the blank with a term from our course: In the line labeled next_line, we need a ______. 3. This problem concerns the MPI code for counting primes, in pages 11ff. (a) (20) When the number 25 is checked for primeness, how many times is it sent from one node to another? (b) (20) Suppose that instead of the three-node setting here, we were to rewrite the code for four nodes. There now would be a function Node3() that would look very similar to the old function Node2(). Referring to specific line numbers, state the differences (old line contents versus new ones) between the old Node2() and the new Node3(). Solutions: 1a 48, 49 1b 0, 3 2 barrier 3a 1 3b in line 134, and in line 141, ","course":"ECS158"} {"quiz":" Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. Answers to fill-ins must result in one (not more) grammatically correct sentence. 1. (15) Consider a network for communication among PEs in the form of a ring. PE0 is connected to PE1, PE1 to PE2, and so on, and finally PE(n-1) is connected to PE0. At every clock cycle, PEi sends a packet to PE[(i+1) mod n], i = 0,1,...,n-1. Give the entries for the table on p.21 for this new network. 2. (15) Fill the blank: Consider a function f() in a threaded program. If it does not access any globals, its memory accesses should not cause cache coherency problems, because each thread has ______. 3. The Intel instruction set includes MOVS (``move string''), which will copy a specified number of bytes from one location in memory to another. However, the LOCK prefix is not allowed here. 
(a) (10) Show exactly where in our book this disallowance is confirmed. (b) (15) Fill the blank: It is disallowed because allowing it would cause ______. 4. (15) Suppose you see code, written by someone else, that includes a very long int array z on a shared-memory, 64-bit machine. References to z are only of the form z[128*i], z[128*j] and so on. Fill in the blanks: The likely motivation for this odd code is that the programmer is trying to avoid ______, and the size of this machine's ______ is ______. 5. This problem concerns the OpenMP example implementing the Dijkstra algorithm, pages 40ff. (a) (20) Unlike the pseudocode on p.42, there is no Done array in the actual code. Suppose we were to add code to have such an array, named done. We could declare it between lines 17 and 18, allocate it between lines 33 and 34, and initialize its values to all 0s between lines 44 and 45. State the line number(s) at which we could change the code so as to assign done's elements, and state the code needed for that. (b) (10) Suppose I had forgotten to write line 77. Which of the following would then be true? (i) The program would still work correctly. (ii) The program might or might not work correctly, depending on which random numbers were generated in init(). (iii) The program would definitely give incorrect results. (iv) The program would trigger a seg fault. (v) None of the above. Solutions: 1. latency , bandwidth , cost 2. ``...each thread has its own stack'' 3a. footnote 6, p.23 3b. ``...would cause the bus to be locked for too long'' 4. false sharing; cache blocks; 5a. Change line 102 to notdone[mv] = 0; done[mv] = 1; 5b. (i) ","course":"ECS158"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. Unless otherwise stated, give numerical answers as expressions, e.g. . Do NOT use calculators. 1. 
(40) Suppose we were to write an OpenMP version of the dice simulation example of the Python multiprocessing module. Say we store our grand total in a global variable tot, with count storing the thread's individual count. Show how to efficiently write the OpenMP version of [fontsize=-2] totlock.acquire() tot.value += count totlock.release() 2. (30) Consider the Python multiprocessing example of Quicksort, using the Queue class. Suppose the original array to be sorted was (12,5,13,6,8,10,2,21,20,15). When the work item (i,j,2) is placed into the queue, what will be the values of i and j? Note: The code in separate() is not quite right, but assume it works correctly, which is to rearrange the given range within xc so that all the elements smaller than xc[low] are moved to the left of that element, and all the ones larger than that element are moved to its right, with the return value last being the final resting place of xc[low]. 3. (30) The function allgt(x,y,n) below, to run in an OpenMP context, returns 1 (i.e. True) if all elements of x are greater than their counterparts in y. Each of the arrays x and y is of length n. (In this implementation, no effort is made to do no further checking after encountering a False case.) The function is to be called from within an OpenMP parallel block. Fill in the blanks. [fontsize=-2] int allgt(x,y,n) int all,i; #pragma omp _______________________________________ for (i = 0; i < n; i++) _____________________________ return all; Solutions: 1. [fontsize=-2] #pragma omp atomic tot += count; 2. The first call to separate() will rearrange the array to (5,6,8,10,2,12,13,21,20,15) and return 5, and (6,9,1) will be put in the queue. The next thread will work on that, and put (7,9,2) in the queue. 3. 
int allgt(x,y,n) int all,i; #pragma omp for reduction(&&:all) for (i = 0; i < n; i++) all = all && (x[i] > y[i]); return all; ","course":"ECS158"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (30) Fill in the blanks in the following R code, which computes and returns the matrix A in Equation (10.20), C = AX/n. Note: R's exponentiation operator is ^, so that for instance 2^3 is 8. R's diag() function is quite versatile. When applied to a matrix, you get a vector (formed from the diagonal of the matrix), and vice versa. makeamat <- function(n,q) m <- matrix(nrow=n,ncol=n) for (i in 1:n) for (j in i:n) if (i == j) # first blank (one line) else # second blank (one line) m[j,i] <- m[i,j] return(m) 2. (30) Fill in the blanks in the following R code to implement the Jacobi algorithm on a GPU. It solves ax = b, returning x. library(gputools) jcb <- function(a,b,eps) n <- length(b) d <- diag(a) # a vector, not a matrix tmp <- diag(d) # a matrix, not a vector o <- # first blank (partial line) di <- 1/d x <- b # initial guess, could be better repeat oldx <- x tmp <- # second blank (partial line) tmp <- b - tmp x <- di * tmp # elementwise multiplication # third blank (one line) 3. (40) Fill in the blanks in the following R code, which implements smoothing (removal of ``blips'') on sound data (loudness) snd, cutting off frequencies for greater than maxidx in (10.13). R's fft() function, when applied to a vector x, finds the Discrete Fourier Transform of x. If the optional second argument inverse is set to TRUE, it will find the inverse transform instead. You'll find R's rep() function useful; e.g. rep(2,5) is the vector (2,2,2,2,2). 
[fontsize=-2] lp <- function(snd,maxidx) four <- # first blank (partial line) n <- length(four) newfour <- # second blank (partial line) # third blank (one line) Solutions: 1. [fontsize=-2] makeamat <- function(n,u) m <- matrix(nrow=n,ncol=n) for (i in 1:n) for (j in i:n) if (i == j) m[i,i] <- u^((i-1)^2) else m[i,j] <- u^((i-1)*(j-1)) m[j,i] <- m[i,j] return(m) 2. [fontsize=-2] library(gputools) jcb <- function(a,b,eps) n <- length(b) d <- diag(a) # a vector, not a matrix tmp <- diag(d) # a matrix, not a vector o <- a - diag(d) di <- 1/d x <- b # initial guess, could be better repeat oldx <- x tmp <- gpuMatMult(o,x) tmp <- b - tmp x <- di * tmp # elementwise multiplication if (sum(abs(x-oldx)) < n * eps) return(x) 3. [fontsize=-2] lp <- function(snd,maxidx) four <- fft(snd) n <- length(four) newfour <- c(four[1:maxidx],rep(0,n-maxidx)) return(Re(fft(newfour,inverse=T)/n)) ","course":"ECS158"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), sum(), combn() etc. 1. (10) In a certain line in some MPI program, we wish to receive a message from node 0, consisting of an array of 10000 integers, placing it in y. The message type will be XTYPE. We have declarations int *y; MPI_Status stat; and have called malloc(), assigning the result to y. Give the call needed to receive the data. 2. A military commander in old Egypt, Amr ibn al-As, once wrote to a Caliph Umar, ``I will send to Medina a camel train so long that the first camel will reach you before the last one has left me.'' (True story.) Fill in the blanks with terms from our course: [(a)] (10) An increase in camel walk speed would reduce the . 
(b) (10) An increase in the time it takes to load a camel and start it on its journey would reduce the ______. 3. This problem concerns the function bsort() in Section 1.3.2.6 of our book. (a) (10) Briefly (in just one line in your electronic Quiz submission), discuss this algorithm in terms of the load balance issue. (b) (10) Give a single C statement, to go between lines 92 and 93, that will be executed by the last thread, that is, the thread with the numerically largest thread number. The code will print out the index (i.e. subscript) within x at which the chunk written by thread 1 begins. If for example thread 1 writes to x[12] through x[28], then 12 will be printed out. (c) (10) Internally, the pragma on line 43 likely makes multiple calls to which pthreads function? 4. (40) This will be a variant of the matrix-multiply Snow code in our text. The two main differences are that (a) we send each worker its assigned rows of the multiplier, rather than sending the full matrix and assigned row numbers, and (b) the multiplier matrix is upper-triangular. Concerning (a), say we have two workers and the multiplier has 6 rows. Then in uv(), uchunks[[1]] will be the submatrix of u consisting of the first 3 rows of the latter. Concerning (b), we want to avoid wasteful multiplication by the 0s below the diagonal in u, but they are there, i.e. the matrix is not stored in compressed form. Fill in the blanks (only one line or partial line each): uv <- function(cls,u,v) rownums <- splitIndices(nrow(u),length(cls)) # tack on row numbers to u u1 <- cbind(u,1:nrow(u)) uchunks <- list() for (i in 1:length(cls)) uchunks[[i]] <- blank (a) res <- clusterApply(cls,uchunks,chunkmul,v) Reduce(c,res) chunkmul <- function(uchunk,v) nr <- nrow(uchunk) nc <- ncol(uchunk) ncu <- nc - 1 prod <- vector(length=nr) for (i in 1:nr) urownum <- uchunk[i,nc] rng <- blank (b) prod[i] <- blank (c) prod Solutions: 1. 
MPI_Recv(y,10000,MPI_INT,0,XTYPE,MPI_COMM_WORLD,&stat); 2.a latency 2.b bandwidth 3.a Load balance could be poor, say if the numbers are skewed toward their lower range. 3.b if (me == nth-1) printf(\" 3.c pthread_create() 4. uv <- function(cls,u,v) rownums <- splitIndices(nrow(u),length(cls)) # tack on row numbers to u u1 <- cbind(u,1:nrow(u)) uchunks <- list() for (i in 1:length(cls)) uchunks[[i]] <- u1[rownums[[i]],] res <- clusterApply(cls,uchunks,chunkmul,v) Reduce(c,res) chunkmul <- function(uchunk,v) nr <- nrow(uchunk) nc <- ncol(uchunk) ncu <- nc - 1 prod <- vector(length=nr) for (i in 1:nr) urownum <- uchunk[i,nc] rng <- urownum:ncu prod[i] <- uchunk[i,rng] %*% v[rng] prod source(\"umul.R\") c2 <- makeCluster(2) ","course":"ECS158"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. Unless otherwise stated, give numerical answers as expressions, e.g. . Do NOT use calculators. 1. Consider a six-dimensional hypercube , subdivided into two five-dimensional hypercubes and . (a) What is the node number of the partner of node 23? (b) What is the node number of the root in ? (c) Suppose our algorithm requires partners in the two 5-cubes to exchange their values of an int variable x. What would be the best MPI function for this purpose? 2. Consider the program on pp.85-87. (a) Suppose that while running the program, someone runs the shell commands ps and gdb. At this point, the likely line number on which the program is running (at all nodes) is ______. (b) Fill in the table regarding the actions of lines 107 and 108 and the array overallmin, at a given node. Mark an entry R if the array is read, W if it is written, RW if both, and N if neither: node number | 107 | 108; 0 | ______ | ______; > 0 | ______ | ______. (c) This example program is somewhat artificial, in that each node generates its data matrix ohd. 
Instead, suppose that node 0 has the matrix, say by reading it from disk, and wishes to distribute it to the other nodes. Give a single line of code to replace lines 57-64 and accomplish this distribution. Solutions: 1a. 23 + 32 = 55 1b. 100000, i.e. 32 1c. MPI_Sendrecv() 2a. 70 2b. node number | 107 | 108; 0 | W | R; > 0 | N | W 2c. MPI_Bcast(ohd,nv*nv,MPI_INT,0,MPI_COMM_WORLD); ","course":"ECS158"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. IMPORTANT NOTE: If you believe that nothing needs to be placed into a blank, simply give Nothing as your answer. 1. (50) Consider the mutual outlinks example, beginning on p.93. Below is a different version of dowork(). One of the differences is that it uses dynamic loop scheduling. Another difference is that it doesn't use atomic or critical. Fill in the blanks. float dowork() #pragma omp parallel int i; #pragma omp for BLANKa for (BLANKb) tot += procpairs(i); BLANKc BLANKd int divisor = n * (n-1) / 2; return ((float) tot)/divisor; 2. (50) Below is an MPI program that removes 0s from an array. The strategy is that first the manager node breaks the original array into equal-sized chunks, sending one to each worker. Each worker removes the 0s and sends back the nonzero elements. The manager collects these into an array no0s. 
Fill in the blanks below: [numbers=left] #include #include #define MAX_N 100000 #define MAX_NPROCS 100 #define DATA_MSG 0 #define NEWDATA_MSG 1 int nnodes, // number of MPI processes n, // size of original array me, // my MPI ID has0s[MAX_N], // original data no0s[MAX_N], // 0-free data nno0s; // number of non-0 elements int debug; // not shown init(int argc, char **argv) void managernode() MPI_Status status; int i; int lenchunk; // assumed divides evenly lenchunk = n / nnodes; for (i = 1; i < nnodes; i++) BLANKa int k = 0; for (i = 1; i < nnodes; i++) BLANKb BLANKc BLANKd nno0s = k; // not shown void remov0s(int *oldx, int n, int *newx, int *nnewx) void workernode() int lenchunk; MPI_Status status; BLANKe BLANKf remov0s(has0s,lenchunk,no0s,&nno0s); BLANKg // not shown int main(int argc,char **argv) Solutions: 1. [numbers=left] float dowork() #pragma omp parallel int i; #pragma omp for reduction(+:tot) schedule(dynamic) for (i = 0; i < n-1; i++) tot += procpairs(i); int divisor = n * (n-1) / 2; return ((float) tot)/divisor; 2. 
[numbers=left] #include #include #define MAX_N 100000 #define MAX_NPROCS 100 #define DATA_MSG 0 #define NEWDATA_MSG 1 int nnodes, // number of MPI processes n, // size of original array me, // my MPI ID has0s[MAX_N], // original data no0s[MAX_N], // 0-free data nno0s; // number of non-0 elements int debug; init(int argc, char **argv) int i; MPI_Init(&argc,&argv); MPI_Comm_size(MPI_COMM_WORLD,&nnodes); MPI_Comm_rank(MPI_COMM_WORLD,&me); n = atoi(argv[1]); if (me == 0) for (i = 0; i < n; i++) has0s[i] = rand() else debug = atoi(argv[2]); while (debug) ; void managernode() MPI_Status status; int i; int lenchunk; lenchunk = n / (nnodes-1); // assumed divides evenly for (i = 1; i < nnodes; i++) MPI_Send(has0s+(i-1)*lenchunk,lenchunk, MPI_INT,i,DATA_MSG,MPI_COMM_WORLD); int k = 0; for (i = 1; i < nnodes; i++) MPI_Recv(no0s+k,MAX_N, MPI_INT,i,NEWDATA_MSG,MPI_COMM_WORLD,&status); MPI_Get_count(&status,MPI_INT,&lenchunk); k += lenchunk; nno0s = k; void remov0s(int *oldx, int n, int *newx, int *nnewx) int i,count = 0; for (i = 0; i < n; i++) if (oldx[i] != 0) newx[count++] = oldx[i]; *nnewx = count; void workernode() int lenchunk; MPI_Status status; MPI_Recv(has0s,MAX_N, MPI_INT,0,DATA_MSG,MPI_COMM_WORLD,&status); MPI_Get_count(&status,MPI_INT,&lenchunk); remov0s(has0s,lenchunk,no0s,&nno0s); MPI_Send(no0s,nno0s, MPI_INT,0,NEWDATA_MSG,MPI_COMM_WORLD); int main(int argc,char **argv) int i; init(argc,argv); if (me == 0 && n < 25) for (i = 0; i < n; i++) printf(\" printf(\"\"); if (me == 0) managernode(); else workernode(); if (me == 0 && n < 25) for (i = 0; i < n; i++) printf(\" printf(\"\"); MPI_Finalize(); ","course":"ECS158"} {"quiz":" xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. IMPORTANT NOTE: If you believe that nothing needs to be placed into a blank, simply give Nothing as your answer. 1. 
The CUDA code below converts an array to cumulative sums. For instance, if the original array is (3,1,2,0,3,0,1,2), then it is changed to (3,4,6,6,9,9,10,12). The general plan is for each thread to operate on one chunk of the array. A thread will find cumulative sums for its chunk, and then adjust them based on the high values of the chunks that precede it. In the above example, for instance, say we have 4 threads. The threads will first produce (3,4), (2,2), (3,3) and (1,3). Since thread 0 found a cumulative sum of 4 in the end, we must add 4 to each element of (2,2), yielding (6,6). Thread 1 had found a cumulative sum of 2 in the end, which together with the 4 found by thread 0 makes 6. Thus thread 2 must add 6 to each of its elements, i.e. add 6 to (3,3), yielding (9,9). The case of thread 3 is similar. Fill in the blanks. [numbers=left] // CUDA example // inputs an int array x of length n and // computes cumulative sums of the // elements, writing them back to x // for this simple illustration, it is // assumed that the code runs in just // one block, and that the number of threads // evenly divides n #include #include __global__ void cumulker(int *dx, int n) int me = BLANKa; int csize = n / BLANKb; int start = BLANKc * csize; int i,j,base; for (i = 1; i < csize; i++) BLANKd; dx[j] = dx[j-1] + dx[j]; BLANKe; if (BLANKf) base = 0; for (j = 0; j < me; j++) BLANKg; BLANKh; if (me > 0) for (BLANKi) dx[i] += base; int cumul(int *x, int n, int bsize) int *dx; // device copy of x BLANKi; BLANKj; dim3 dimGrid(1,1); dim3 dimBlock(BLANKk,1,1); cumulker<<>>(dx,n); BLANKl; cudaMemcpy(x,dx,n*sizeof(int),cudaMemcpyDeviceToHost); cudaFree(dx); int main(int argc, char **argv) int i; int *x; // host output matrix int n = atoi(argv[1]); int bsize = atoi(argv[2]); x = (int *) malloc(n*sizeof(int)); for (i = 0; i < n; i++) x[i] = rand() if (n < 25) for (i = 0; i < n; i++) printf(\" printf(\"\"); cumul(x,n,bsize); if (n < 25) for (i = 0; i < n; i++) printf(\" printf(\"\"); 
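The chunk-and-adjust plan described in problem 1 can also be traced on the CPU. Below is a minimal sequential sketch in plain C, not the quiz's CUDA kernel: the function name chunked_cumsum and the parameter nchunks (standing in for the number of threads) are hypothetical, and each pass of the outer loops plays the role of one thread. As in the quiz, it assumes nchunks evenly divides n.

```c
#include <assert.h>

/* Sequential sketch of the two-phase cumulative sum (hypothetical helper,
   not from the course text). nchunks stands in for the thread count and
   is assumed to divide n evenly. */
void chunked_cumsum(int *x, int n, int nchunks)
{
    assert(n > 0 && nchunks > 0 && n % nchunks == 0);
    int csize = n / nchunks;

    /* Phase 1: each chunk forms its own local cumulative sums. */
    for (int c = 0; c < nchunks; c++) {
        int start = c * csize;
        for (int i = 1; i < csize; i++)
            x[start + i] += x[start + i - 1];
    }

    /* Phase 2: add to each later chunk the total of all chunks before it.
       The local total of a chunk is read before that chunk is adjusted. */
    int base = x[csize - 1];              /* local total of chunk 0 */
    for (int c = 1; c < nchunks; c++) {
        int start = c * csize;
        int local_total = x[start + csize - 1];
        for (int i = start; i < start + csize; i++)
            x[i] += base;
        base += local_total;
    }
}
```

With x = (3,1,2,0,3,0,1,2), n = 8 and nchunks = 4, phase 1 leaves (3,4), (2,2), (3,3), (1,3) and phase 2 yields (3,4,6,6,9,9,10,12), matching the example in the problem statement. Unlike a parallel kernel, the sketch keeps a running base instead of having each chunk re-sum the earlier totals, which is safe here only because the chunks are processed in order.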
Solutions: 1. [numbers=left] // CUDA example // inputs an int array x of length nand computes cumulative sums of the // elements, writing them back to x // for this simple illustration, it is assumed that the code runs in // just one block, and that the number of threads evenly divides n #include #include __global__ void cumulker(int *dx, int n) int me = threadIdx.x; int csize = n / blockDim.x; int start = me * csize; int i,j,base; for (i = 1; i < csize; i++) j = start + i; dx[j] = dx[j-1] + dx[j]; __syncthreads(); if (me > 0) base = 0; for (j = 0; j < me; j++) base += dx[(j+1)*csize-1]; __syncthreads(); if (me > 0) for (i = start; i < start + csize; i++) dx[i] += base; int cumul(int *x, int n, int bsize) int *dx; // device copy of x int chunksize = n / bsize; // number of elements handled by each thread cudaMalloc((void **)&dx,n*sizeof(int)); cudaMemcpy(dx,x,n*sizeof(int),cudaMemcpyHostToDevice); dim3 dimGrid(1,1); dim3 dimBlock(bsize,1,1); cumulker<<>>(dx,n); cudaThreadSynchronize(); cudaMemcpy(x,dx,n*sizeof(int),cudaMemcpyDeviceToHost); cudaFree(dx); int main(int argc, char **argv) int i; int *x; // host output matrix int n = atoi(argv[1]); int bsize = atoi(argv[2]); x = (int *) malloc(n*sizeof(int)); for (i = 0; i < n; i++) x[i] = rand() if (n < 25) for (i = 0; i < n; i++) printf(\" printf(\"\"); cumul(x,n,bsize); if (n < 25) for (i = 0; i < n; i++) printf(\" printf(\"\"); ","course":"ECS158"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. Unless otherwise stated, give numerical answers as expressions, e.g. . Do NOT use calculators. 1. (35) The code below does an in-place transpose of a square matrix. (Note: No unnecessary computation is done.) Fill in the blanks. 
[fontsize=-2] #include #include #include __global__ void transpairs(int *m, int n, int nth) int thn = blockIdx.x; // thread number // this thread will handle one below-diagonal element and // its \"mate\" above the diagonal; // first, determine the row and column of // the below-diagonal one int i,j,count=-1, done = 0; for (i=0; i < n-1;i++) for (j=0; j<=i; j++) count++; if (count == thn) done = 1; break; if (done) break; i++; __________________________________________________ int tmp = m[w1]; m[w1] = m[w2]; m[w2] = tmp; int main(int argc, char **argv) int n = atoi(argv[1]); // number of matrix rows/cols int *hm, *dm; int msize = n * n * sizeof(int); hm = (int *) malloc(msize); // as a test, fill matrix with consecutive integers int t = 1,i,j; for (i = 0; i < n; i++) for (j = 0; j < n; j++) hm[i*n+j] = t++; cudaMalloc((void **)&dm,msize); cudaMemcpy(dm,hm,msize,cudaMemcpyHostToDevice); __________________________________________________; dim3 _________________________________________________________; dim3 _________________________________________________________; transpairs<<>>(dm,n,nth); cudaThreadSynchronize(); cudaMemcpy(hm,dm,msize,cudaMemcpyDeviceToHost); if (n < 10) for(int i=0; i #include #define MAX_N 10000000 #define MAX_NODES 10 int nnodes, // number of MPI processes n, // size of x me, // MPI rank of this node // full data for node 0, part for the rest x[MAX_N], // cumulative sums for this node, and // eventually for full array at Node 0 csums[MAX_N], // max values at the various nodes maxvals[MAX_NODES]; int debug; init(int argc, char **argv) // not shown, nothing special here void cumulsums() MPI_Status status; int i,lenchunk,sum,node; // assumed divides evenly lenchunk = n / nnodes; // note that node 0 will participate // in the computation too BLANKa(BLANKb,lenchunk,MPI_INT, BLANKc,lenchunk,MPI_INT, 0,MPI_COMM_WORLD); sum = 0; for (i = 0; i < lenchunk; i++) csums[i] = sum + x[i]; BLANKd; BLANKe(BLANKf,1,MPI_INT,BLANKg,1,MPI_INT, 0,MPI_COMM_WORLD); 
BLANKh(BLANKi,nnodes,MPI_INT,0,MPI_COMM_WORLD); if (me > 0) sum = 0; for (node = 0; node < BLANKj; node++) sum += BLANKj[node]; for (i = 0; i < lenchunk; i++) csums[i] += sum; BLANKk(csums,lenchunk,MPI_INT,csums,lenchunk,MPI_INT, 0,MPI_COMM_WORLD); int main(int argc,char **argv) // not shown, nothing special here, other // than a call to cumulsums() Solutions: [numbers=left] // finds cumulative sums in the array x #include #include #define MAX_N 10000000 #define MAX_NODES 10 int nnodes, // number of MPI processes n, // size of x me, // MPI rank of this node // full data for node 0, part for the rest x[MAX_N], csums[MAX_N], // cumulative sums for this node maxvals[MAX_NODES]; // the max values at the various nodes int debug; init(int argc, char **argv) int i; MPI_Init(&argc,&argv); MPI_Comm_size(MPI_COMM_WORLD,&nnodes); MPI_Comm_rank(MPI_COMM_WORLD,&me); n = atoi(argv[1]); // test data if (me == 0) for (i = 0; i < n; i++) x[i] = rand() debug = atoi(argv[2]); while (debug) ; void cumulsums() MPI_Status status; int i,lenchunk,sum,node; lenchunk = n / nnodes; // assumed to divide evenly // note that node 0 will participate in the computation too MPI_Scatter(x,lenchunk,MPI_INT,x,lenchunk,MPI_INT, 0,MPI_COMM_WORLD); sum = 0; for (i = 0; i < lenchunk; i++) csums[i] = sum + x[i]; sum += x[i]; MPI_Gather(&csums[lenchunk-1],1,MPI_INT, maxvals,1,MPI_INT,0,MPI_COMM_WORLD); MPI_Bcast(maxvals,nnodes,MPI_INT,0,MPI_COMM_WORLD); if (me > 0) sum = 0; for (node = 0; node < me; node++) sum += maxvals[node]; for (i = 0; i < lenchunk; i++) csums[i] += sum; MPI_Gather(csums,lenchunk,MPI_INT,csums,lenchunk,MPI_INT, 0,MPI_COMM_WORLD); int main(int argc,char **argv) int i; init(argc,argv); if (me == 0 && n < 25) for (i = 0; i < n; i++) printf(\" printf(\"\"); cumulsums(); if (me == 0 && n < 25) for (i = 0; i < n; i++) printf(\" printf(\"\"); MPI_Finalize(); ","course":"ECS158"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary 
sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. Group, in-class programming problem: You are given an image in which each pixel is known to be either fully black (1) or fully white (0), and in which the black sections are in the form of nonoverlapping squares. The task is to find all of the squares. You must use either Rdsm or OpenMP. The form of function call, for R/Rdsm, is [fontsize=-2] findsqrs <- function(m) where m is the matrix of pixels. The return value is a three-column matrix, where a row (u,v,w) means that the ``northwest corner'' of a square is at row u, column v, and the width is w. The C/OpenMP form is [fontsize=-2] int *findsqrs(int *m, int mr, int mc) Here mr and mc are the number of rows and columns in m, and the return value is three-column matrix as above. Row and column numbering will increase as one goes down and rightward in the image, starting at (1,1) for R and (0,0) for C. A square could be of size 1x1. As an example (for R), consider this matrix: [fontsize=-2] > m [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [1,] 0 0 0 0 0 0 0 0 0 [2,] 0 0 1 1 1 0 0 0 0 [3,] 0 0 1 1 1 0 0 0 0 [4,] 0 0 1 1 1 0 1 1 0 [5,] 0 1 1 0 0 0 1 1 0 [6,] 0 1 1 0 0 0 0 0 0 The function findsqrs() would return the matrix [fontsize=-2] [,1] [,2] [,3] [1,] 2 3 3 [2,] 4 7 2 [3,] 5 2 2 (possibly with the rows permuted). Requirements: A point (u,v) will not be tested for ``NWCness'' (being the northwest corner of a square) by more than one thread. Once a square has been discovered, no thread will test any points within it for NWCness. At the end of the class period, e-mail me your code (all routines in a single file). Of course, make sure all the names of your group members are in comments at the top of the file. If your code doesn't run properly, just send me what you have, with some comments added as to what you would do to try to fix it if you had more time. 
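As a point of reference for the squares problem, here is a serial Python sketch of the search, with 0-based indexing as in the C form. It is not a solution to the assignment, which requires Rdsm or OpenMP; the name find_squares is hypothetical, and the sketch assumes (this is not stated in the problem) that the run of black pixels to the right of a northwest corner belongs to a single square. Pixels inside a discovered square are marked as claimed and never tested again.

```python
def find_squares(m):
    '''Serial sketch: return (row, col, width) for each black square in m.

    m is a list of rows of 0/1 values, indexed from 0.
    Assumption (not from the problem statement): the run of black pixels
    to the right of a northwest corner belongs to a single square.
    '''
    nrows, ncols = len(m), len(m[0])
    claimed = [[False] * ncols for _ in range(nrows)]
    squares = []
    for u in range(nrows):
        for v in range(ncols):
            if m[u][v] != 1 or claimed[u][v]:
                continue
            # In row-major order, the first unclaimed black pixel of a
            # square is its northwest corner; measure the width along row u.
            w = 0
            while v + w < ncols and m[u][v + w] == 1 and not claimed[u][v + w]:
                w += 1
            # Claim the whole square so its interior is never tested.
            for r in range(u, min(u + w, nrows)):
                for c in range(v, v + w):
                    claimed[r][c] = True
            squares.append((u, v, w))
    return squares
```

A parallel version would additionally have to decide which thread tests which points, so that no point is tested twice.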
I am doubtful that the most straightforward implementation will actually produce a speedup, at least in R. However, your first priority must be to write code that works. If you get it running and would like to submit a second version that seems faster, please do so and you will receive Extra Credit. ","course":"ECS158"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), sum(), combn() etc. IMPORTANT NOTE: All questions refer to CUDA/NVIDIA GPUs. 1. (15) Fill in the blank: Having a lot of threads helps achieve ______ hiding. 2. (15) Suppose the variable n is our problem size, and we wish to check whether the total number of threads evenly divides this quantity. Write one line of code, to be run on the device, that sets the bool variable evenlydivides (assumed previously declared) to true if this condition holds, false otherwise. Assume we only use the ``x dimension'' in grids and blocks. 3. Consider the mutual-outlinks problem in Section 5.8, but changed as follows: Line 14 is now i = me; Line 19 is now blank. Variables such as totth will now be ignored. (They would be removed, but let's keep the specs simple here.) Answer the questions below. Assume the program won't ever be run with any partially-filled blocks; the total number of threads will be evenly divisible by the number of blocks. [(a)] (20) One more line in the program would need to be changed. State which one, and what the new version of the line would look like. Note! Put your answer on just ONE line in your submitted electronic file. Sample answer line: change line 45 to: i = 168; [(b)] (15) Which line, already a drain on execution speed, would become even more of a drain, and why? 4. 
(20) Consider static versus dynamic scheduling of loop operations. Our CUDA examples, such as Line 14 in the mutual-outlink example (original version, not changed as in Problem 3), have used static scheduling. Explain in one line why dynamic scheduling generally would not be a good choice for our GPU programming. 5. (15) Consider the primes-finding code in Section 5.9, specifically the function sieve(). This question concerns the issue of possible bank conflicts between thread 0 and thread 1. Fill in the blank with a mathematical condition involving chunk: There will be no bank conflicts as long as ______ . Solutions: 1. latency 2. evenlydivides = (n % (gridDim.x * blockDim.x) == 0); 3.a change line 50 to: dim3 dimGrid(n/192,1); 3.b Line 20. It would be executed more often, thus more of a drain. 4. It would be difficult to implement dynamic scheduling without inducing a large degree of thread divergence. Also, dynamic scheduling requires atomic access to a work queue, which is slow on our GPUs. 5. In Line 74 the two threads will be simultaneously accessing items that are chunk words apart in memory. Those will be in separate banks as long as chunk mod 32 is not 0. ","course":"ECS158"} {"quiz":"Directions: Work only on this sheet. Put all your ``fill the blank'' answers in one place, say the lower-right part of this side, or on the back. Format: [fontsize=-2] #1a. x+y #1b if(u > v) w = 3; ... MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. Below is an MPI version of the bucket sort in our OpenMP example (except that the bin boundaries are assumed known ahead of time, rather than calculated from a sample). The message-passing strategy is outlined in the comments at the beginning of the code. Fill in the blanks. 
[fontsize=-2,numbers=left] // bucket sort, bin boundaries known in advance // node 0 is manager, all else worker nodes; node 0 sends full data, bin // boundaries to all worker nodes; i-th worker node extracts data for // bin i-1, sorts it, sends sorted chunk back to node 0; node 0 places // sorted results back in original array // not claimed efficient; e.g. could be better to have manager place // items into bins #include #define MAX_N 100000 // max size of original data array #define MAX_NPROCS 100 // max number of MPI processes #define DATA_MSG 0 // manager sending original data #define BDRIES_MSG 0 // manager sending bin boundaries #define CHUNKS_MSG 2 // workers sending their sorted chunks int nnodes, // n, // size of full array me, // my node number fulldata[MAX_N], tmp[MAX_N], nbdries, // number of bin boundaries counts[MAX_NPROCS]; float bdries[MAX_NPROCS-2]; // bin boundaries int debug,debugme; init(int argc, char **argv) int i; debug = atoi(argv[3]); debugme = atoi(argv[4]); MPI_Init(&argc,&argv); MPI_Comm_size(MPI_COMM_WORLD,&nnodes); MPI_Comm_rank(MPI_COMM_WORLD,&me); nbdries = nnodes - 2; n = atoi(argv[1]); int k = atoi(argv[2]); // for random # gen // generate random data for test purposes for (i = 0; i < n; i++) fulldata[i] = rand() // generate bin boundaries for test purposes for (i = 0; i < nbdries; i++) bdries[i] = i * (k+1) / ((float) nnodes); void managernode() MPI_Status status; int i; int lenchunk; // length of a chunk received from a worker // send full data, bin boundaries to workers for (i = 1; i < nnodes; i++) MPI_Send(BLANKa,BLANKb,MPI_INT,BLANKc,DATA_MSG,MPI_COMM_WORLD); MPI_Send(BLANKd,BLANKe,MPI_FLOAT,BLANKf,BDRIES_MSG,MPI_COMM_WORLD); // collect sorted chunks from workers, place them in their proper // positions within the original array int currposition = 0; for (i = 1; i < nnodes; i++) MPI_Recv(tmp,MAX_N,MPI_INT,BLANKg,CHUNKS_MSG,MPI_COMM_WORLD,&status); MPI_Get_count(&status,MPI_INT,BLANKh); // memcpy(d,s,nb) copies nb bytes from s 
to d memcpy(BLANKi); BLANKj; if (n < 25) for (i = 0; i < n; i++) printf(\" printf(\"\"); // adds xi to the part array, increments npart, the length of part void grab(int xi, int *part, int *npart) part[*npart] = xi; *npart += 1; int cmpints(int *u, int *v) if (*u < *v) return -1; if (*u > *v) return 1; return 0; void getandsortmychunk(int *tmp, int n, int *chunk, int *lenchunk) int i,count = 0; int workernumber = me - 1; for (i = 0; i < n; i++) if (workernumber == 0) if (tmp[i] <= bdries[0]) grab(tmp[i],chunk,&count); else if (workernumber < nbdries-1) if (tmp[i] > bdries[workernumber-1] && tmp[i] <= bdries[workernumber]) grab(tmp[i],chunk,&count); else if (tmp[i] > bdries[nbdries-1]) grab(tmp[i],chunk,&count); qsort(chunk,count,sizeof(int),cmpints); *lenchunk = count; void workernode() int n,fulldata[MAX_N], // size and storage of full data chunk[MAX_N], lenchunk, nbdries; // number of bin boundaries float bdries[MAX_NPROCS-1]; // bin boundaries MPI_Status status; MPI_Recv(fulldata,MAX_N,MPI_INT,BLANKk,DATA_MSG,MPI_COMM_WORLD,&status); MPI_Get_count(&status,MPI_INT,BLANKl); MPI_Recv(bdries,MAX_NPROCS-2,MPI_FLOAT,BLANKm,BDRIES_MSG, MPI_COMM_WORLD,&status); MPI_Get_count(&status,MPI_FLOAT,BLANKn); getandsortmychunk(fulldata,n,chunk,&lenchunk); MPI_Send(chunk,lenchunk,MPI_INT,BLANKo,CHUNKS_MSG,MPI_COMM_WORLD); int main(int argc,char **argv) int i; init(argc,argv); if (me == 0) managernode(); else workernode(); MPI_Finalize(); Solutions: [fontsize=-2,numbers=left] // bucket sort, bin boundaries known in advance // node 0 is manager, all else worker nodes; node 0 sends full data, bin // boundaries to all worker nodes; i-th worker node extracts data for // bin i-1, sorts it, sends sorted chunk back to node 0; node 0 places // sorted results back in original array // not claimed efficient; e.g. 
could be better to have manager place // items into bins #include #define MAX_N 100000 // max size of original data array #define MAX_NPROCS 100 // max number of MPI processes #define DATA_MSG 0 // manager sending original data #define BDRIES_MSG 0 // manager sending bin boundaries #define CHUNKS_MSG 2 // workers sending their sorted chunks int nnodes, // n, // size of full array me, // my node number fulldata[MAX_N], tmp[MAX_N], nbdries, // number of bin boundaries counts[MAX_NPROCS]; float bdries[MAX_NPROCS-2]; // bin boundaries int debug,debugme; init(int argc, char **argv) int i; debug = atoi(argv[3]); debugme = atoi(argv[4]); MPI_Init(&argc,&argv); MPI_Comm_size(MPI_COMM_WORLD,&nnodes); MPI_Comm_rank(MPI_COMM_WORLD,&me); nbdries = nnodes - 2; n = atoi(argv[1]); int k = atoi(argv[2]); // for random # gen // generate random data for test purposes for (i = 0; i < n; i++) fulldata[i] = rand() // generate bin boundaries for test purposes for (i = 0; i < nbdries; i++) bdries[i] = i * (k+1) / ((float) nnodes); void managernode() MPI_Status status; int i; int lenchunk; // length of a chunk received from a worker // send full data, bin boundaries to workers for (i = 1; i < nnodes; i++) MPI_Send(fulldata,n,MPI_INT,i,DATA_MSG,MPI_COMM_WORLD); MPI_Send(bdries,nbdries,MPI_FLOAT,i,BDRIES_MSG,MPI_COMM_WORLD); // collect sorted chunks from workers, place them in their proper // positions within the original array int currposition = 0; for (i = 1; i < nnodes; i++) MPI_Recv(tmp,MAX_N,MPI_INT,i,CHUNKS_MSG,MPI_COMM_WORLD,&status); MPI_Get_count(&status,MPI_INT,&lenchunk); memcpy(fulldata+currposition,tmp,lenchunk*sizeof(int)); currposition += lenchunk; if (n < 25) for (i = 0; i < n; i++) printf(\" printf(\"\"); // adds xi to the part array, increments npart, the length of part void grab(int xi, int *part, int *npart) part[*npart] = xi; *npart += 1; int cmpints(int *u, int *v) if (*u < *v) return -1; if (*u > *v) return 1; return 0; void getandsortmychunk(int *tmp, int n, int 
*chunk, int *lenchunk) int i,count = 0; int workernumber = me - 1; if (me == debugme) while (debug) ; for (i = 0; i < n; i++) if (workernumber == 0) if (tmp[i] <= bdries[0]) grab(tmp[i],chunk,&count); else if (workernumber < nbdries-1) if (tmp[i] > bdries[workernumber-1] && tmp[i] <= bdries[workernumber]) grab(tmp[i],chunk,&count); else if (tmp[i] > bdries[nbdries-1]) grab(tmp[i],chunk,&count); qsort(chunk,count,sizeof(int),cmpints); *lenchunk = count; void workernode() int n,fulldata[MAX_N], // size and storage of full data chunk[MAX_N], lenchunk, nbdries; // number of bin boundaries float bdries[MAX_NPROCS-1]; // bin boundaries MPI_Status status; MPI_Recv(fulldata,MAX_N,MPI_INT,0,DATA_MSG,MPI_COMM_WORLD,&status); MPI_Get_count(&status,MPI_INT,&n); MPI_Recv(bdries,MAX_NPROCS-2,MPI_FLOAT,0,BDRIES_MSG,MPI_COMM_WORLD,&status); MPI_Get_count(&status,MPI_FLOAT,&nbdries); getandsortmychunk(fulldata,n,chunk,&lenchunk); MPI_Send(chunk,lenchunk,MPI_INT,0,CHUNKS_MSG,MPI_COMM_WORLD); int main(int argc,char **argv) int i; init(argc,argv); if (me == 0) managernode(); else workernode(); MPI_Finalize(); ","course":"ECS158"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (50) The OpenMP code below implements what might be considered a variant of k-means clustering. It is assumed that once a data point is placed into a cluster, it stays with that cluster even as new data points are added. The number of clusters is fixed, but the centroids and counts of cluster members are updated each time a new data point is acquired. Assume that new data arrives in clumps. The code below takes a clump of new data points and updates the cluster centroids and counts. (Which must be updated once for each new data point.) 
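The per-point update described above is a running-mean update: if a cluster holds count points with centroid c, then after a new point x joins, the centroid becomes (count * c + x) / (count + 1) in each coordinate. A minimal serial Python sketch, with the hypothetical name add_point and with no attempt at the OpenMP synchronization the problem is about:

```python
def add_point(centroid, count, point):
    '''Return the updated (centroid, count) after one new point joins.

    centroid and point are lists of the same length p; count is the
    number of points in the cluster before the new one is added.
    '''
    new_count = count + 1
    new_centroid = [(count * c + x) / new_count
                    for c, x in zip(centroid, point)]
    return new_centroid, new_count
```

Because each update reads and writes both the centroid and the count of the chosen cluster, concurrent updates to the same cluster must be serialized in a threaded version.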
Globals: k, the number of clusters p, the dimensionality of the space n, the total number of data points recorded in clusters (will grow by the amount of nnew below) centroids, a matrix of the current centroids, k rows, p columns clstrcounts, an array, length k, recording how many data points are in each cluster grps, an array listing group membership, so that for example grps[88] = 3 means that data point number 88 is in cluster 3; length is assumed as large as n will ever get nnew, the number of new data points clump, matrix of the new data, with nnew rows, p columns Fill in the blanks, and add any lines necessary. For the latter action, write something like, ``Place the following code between lines 8 and 9.'' Do NOT delete or change lines. [fontsize=-2,numbers=left] #pragma omp parallel int i,j,grpnum; for (i = 0; i < nnew; i++) // function closest() (not shown), finds the index of // the closest centroid to the new data point i grpnum = closest(i); for (j = 0; j < p; j++) tmp = centroid[grpnum][j] _______________________________; tmp += clump[i][j]; tmp /= ________________________________ 2. (50) The CUDA code below computes the discrete cosine transform of an image, p.135. Assume there is only one block, with that block consisting of n rows and m columns of threads. Each thread handles a single pixel, making a local copy. Shared memory is not used. The arguments to dct are: n, the number of rows in the image and the transform m, the number of columns in the image and the transform dx, the image data on the device, n rows, m columns dd, the transform data on the device, n rows, m columns, initially all 0.0 [fontsize=-2,numbers=left] __global__ void dct(float *dx,int n, int m, float *dd) int j,k; float pi = 3.14; for (u = 0; u < n; u++) for (v = 0; v < m; v++) float y(int q) if (q == 0) return 0.71; else return 1.0; Solutions: 1. 
[fontsize=-2,numbers=left] #pragma omp parallel int i,j,grpnum; #pragma omp for for (i = 0; i < nnew; i++) // function closest() (not shown), finds the index of // the closest centroid to the new data point i grpnum = closest(i); #pragma omp critical for (j = 0; j < p; j++) tmp = clstrcounts[grpnum] * centroid[grpnum][j]; tmp += clump[i][j]; tmp /= (clstrcounts[grpnum]+1); centroids[grpnum][j] = tmp; clstrcounts[grpnum]++; n++; grps[n] = grpnum; 2. [fontsize=-2,numbers=left] __global__ void dct(float *dx,int n, int m, float *dd) int u,v; int j = threadIdx.x; int k = threadIdx.y; float pi = 3.14, myx, tmp; myx = dx[n*j+k]; for (u = 0; j < n; u++) for (v = 0; k < m; v++) tmp = myx * cos((2*j+1)*u*pi/(2*n)) + cos((2*k+1)*v*pi/(2*m)); tmp /= y(u) * y(v) * 2 / sqrt(m*n); atomicAdd(&dd[n*u+v],tmp); float y(int q) if (q == 0) return 0.71; else return 1.0; ","course":"ECS158"} {"quiz":" Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. In the MPI code that finds prime numbers in a pipelined manner, let's say we measure work by the count of numbers checked by each node. (In the case of node 0, the even numbers won't count as being checked.) [(a)] () Find the approximate ratio of the work done by node 0 and that of node 1, for large N. [(b)] () Fill in the blank with a term from our course: The fact that the ratio in (a) is not near 1.0 shows that we have a problem with . 2. () This problem also concerns the pipelined prime finding MPI code, in this case an altered version of it. Here we allow for a general number of nodes, rather than 3. 
Part of the code in main() changes to [fontsize=-2] if (Me == 0) Node0(); else if (Me == NNodes-1) NodeEnd(); else NodeBetween(); Fill in the blanks for the code in NodeBetween(): [fontsize=-2,numbers=left] void NodeBetween() int ToCheck,Dummy,Divisor; MPI_Status Status; // put 1 statement here while (1) // put 1 statement here if (Status.MPI_TAG == END_MSG) break; if (ToCheck % Divisor > 0) // put 1 statement here // put 1 statement here Solutions: 1. 3/2; load balancing 2. [fontsize=-2] void NodeBetween() int ToCheck,Dummy,Divisor; MPI_Status Status; // first received item gives us our prime divisor // receive into Divisor 1 MPI integer from node Me-1, of any message // type, and put information about the message in Status MPI_Recv(&Divisor,1,MPI_INT,Me-1,MPI_ANY_TAG,MPI_COMM_WORLD,&Status); while (1) MPI_Recv(&ToCheck,1,MPI_INT,Me-1,MPI_ANY_TAG,MPI_COMM_WORLD,&Status); // if the message type was END_MSG, end loop if (Status.MPI_TAG == END_MSG) break; if (ToCheck % Divisor > 0) MPI_Send(&ToCheck,1,MPI_INT,Me+1,PIPE_MSG,MPI_COMM_WORLD); MPI_Send(&Dummy,1,MPI_INT,Me+1,END_MSG,MPI_COMM_WORLD); ","course":"ECS158"} {"quiz":" Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (25) Name the OpenMP functions analogous to MPI_Comm_size() and MPI_Comm_rank(). 2. (25) Suppose OpenMP did not include the single pragma. Rewrite lines 91-92, p.51, without using that pragma. Keep it short! It just has to work, not necessarily be optimal. 3. (25) (Canceled.) 4. (25) Suppose we have the system on p.23, with low-order interleaving, and we have a long array x. Say it takes one clock cycle to move a message one hop in the network, e.g. from one diamond to the one above it, and messages to access x[i] and x[j] leave P2 and P1 at times 0 and 1, respectively. 
Then give a mathematical necessary and sufficient condition for there to be a ``collision'' between the two messages, i.e. one will delay the other at some diamond. Express your answer in math, not English. Any math symbol can be used, including ones made from letters such as cos. Solutions: 1. [fontsize=-2] omp_get_num_threads() omp_get_thread_num() 2. [fontsize=-2] if (me == 0) { md = largeint; mv = 0; } #pragma omp barrier 4. ","course":"ECS158"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), sum(), combn() etc. 1. There are various high-level threads access systems other than OpenMP. One of them is a language called Cilk++, developed at MIT and purchased by Intel. Judging from the names of the Cilk++ constructs below, give names of OpenMP or pthreads constructs that should roughly correspond. [(a)] (5) cilk_for [(b)] (5) cilk::reducer_opadd [(c)] (5) cilk::mutex [(d)] (5) cilk_spawn (a dictionary definition of spawn: ``to produce or create something'') 2. (10) Explain briefly why R's snow library would be a poor choice---essentially an impossible one---for pipelined parallel algorithms such as in our MPI example, Sec. 1.3.3.2. 3. Consider the in-place matrix transposition code in Sec. 4.3.4. [(a)] (10) Fill in the blank: Since each thread works on completely separate elements of the matrix, there ``should'' not be a lot of cache coherency transactions. But there probably will be, due to the problem of ______ . [(b)] (10) Give a potential improvement to one (1) line of the code. 4. (10) Consider the Dijkstra example, pp.71ff. Suppose that in the end, shortest distances from vertex 0 to vertex i are roughly correlated with i, i.e. 
vertices with larger values of i tend to be further from 0. Comment on performance issues that would likely arise. Cite certain variables and/or lines so that it is clear that you understand the issues, but be brief. As usual, you are limited to a single, hopefully not very long, line. 5. (40) The function below is written in R, but could be applied to any scheduling situation; it is merely an analytical tool. The call form is statictime(tasktms,nth), where we have nth threads, and tasktms are the task times, assumed here to be known in advance. Say for instance we have 6 tasks, needing times tasktms[1] through tasktms[6], and 2 threads. We would have some kind of parallel for loop, iterating i from 1 to 6; one thread would handle some values of i, and the other thread would handle the others. Here we assume the static scheduling algorithm used by OpenMP, and the function will return the time needed to complete all the tasks. Fill in the blanks. statictime <- function(tasktms,nth) n <- length(tasktms) a1ton <- 1:n endtimes <- vector(length=nth) for (i in 1:nth) # determine which tasks thread i will handle if (i != nth) this1does <- # blank (a) else this1does <- # blank (b) # blank (c) # blank (d) Solutions: 1.a OMP for pragma 1.b OMP reduction clause, with '+' 1.c pthread_mutex_lock() 1.d pthread_create() 2. It would be impossible to get parallelism this way, without direct communication between the workers. 3.a false sharing 3.b One could try special scheduling, say dynamic, on line 12. 4. Entry of the vertices into the nondone array will roughly occur in order of i, so that soon the low-numbered threads have little or no work to do in line 54. 5. The problem was in part incorrectly specified, as it assumed (without saying so) a chunk size of 1. The code below is written under that assumption. In actuality, the default for static scheduling is to divide the iterations into approximately equal-sized chunks. 
In our situation here, we could fill blank (a) with ((i-1) * floor(n/nth) + 1):(i * floor(n/nth)) and do something similar for blank (b). This does not affect the answers to blanks (c) and (d). statictime <- function(tasktms,nth) n <- length(tasktms) a1ton <- 1:n endtimes <- vector(length=nth) for (i in 1:nth) # determine which tasks thread i will handle if (i != nth) this1does <- which(a1ton %% nth == i) else this1does <- which(a1ton %% nth == 0) endtimes[i] <- sum(tasktms[this1does]) max(endtimes) source(\"umul.R\") c2 <- makeCluster(2) ","course":"ECS158"} {"quiz":" Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (10) There is an algorithm for sorting known as even/odd transposition sort. At each iteration, each array element is swapped with either its left or right neighbor. (More commonly, contiguous chunks in the array are swapped.) If this algorithm were implemented in MPI, what MPI function would be especially appropriate? 2. (15) Consider the Thrust matrix transpose example in Section 6.7.1. State the contents of dmap just before the execution of the scatter operation in line 42. 3. (15) Consider the MPI mutual-outlinks example in Section 8.6.8. Suppose Line 46 were to be changed to i = me; (and Line 51 would become blank). Write a single line of code---and state between which two lines it should be inserted---that does a check for whether the program will still work, and prints out a warning if not. 4. (60) The code below computes the transpose of an n x n matrix (presumably very large), to be done in streaming mode in Hadoop. It is assumed that the input matrix is as in Section 9.10, stored one matrix row per line of the file, with a row numbers column prepended on the left and with spaces used as the delimiter. The output matrix will have the same form, except of course without the prepended column. 
So, for instance, an input matrix 1 1 2 6 2 3 4 8 3 0 5 12 will produce output 1 3 0 2 4 5 6 8 12 Fill in the blanks. Assume that there will be only 1 reducer and 1 output file. mapper: #!/usr/bin/env Rscript con <- file(\"stdin\", open = \"r\") repeat line <- readLines(con,n=1) # read 1 line if (length(line) == 0) break tks <- strsplit(line,split=\" \") tks <- tks[[1]] i <- as.integer(tks[1]) elts <- tks[-1] n <- length(elts) for (j in 1:n) newpos <- blank (a) blank (b) reducer: #!/usr/bin/env Rscript # get n from command line args <- commandArgs(T) n <- as.integer(args[1]) con <- file(\"stdin\", open = \"r\") # make vector of n integers arow <- integer(n) for (blank (c)) for (blank (d)) line <- readLines(con,n=1) line <- strsplit(line,split=\"\") line <- line[[1]] blank (e) blank (f) Solutions: 1. MPISendrecv() 2. [0, 2, 4, 1, 3, 5] 3. Insert, say just before Line 62: if (nnodes < n) printf(\"not enough MPI processes\"); 4. mapper: #!/usr/bin/env Rscript # map/reduce pair inputs rows of a square matrix, and emits one record # for each element of the matrix con <- file(\"stdin\", open = \"r\") repeat line <- readLines(con,n=1) # read 1 line if (length(line) == 0) break tks <- strsplit(line,split=\" \") tks <- tks[[1]] i <- as.integer(tks[1]) elts <- tks[-1] # print(elts) n <- length(elts) for (j in 1:n) newpos <- (j-1) * n + i # print(newpos) cat(newpos,\"\",elts[j],\"\") reducer: #!/usr/bin/env Rscript # get n from command line args <- commandArgs(T) n <- as.integer(args[1]) con <- file(\"stdin\", open = \"r\") arow <- integer(n) for (lineout in 1:n) for (j in 1:n) line <- readLines(con,n=1) line <- strsplit(line,split=\"\") line <- line[[1]] arow[j] <- line[2] cat(arow,\"\") 4. 
# arguments: # a: adjacency matrix # lnks: edges matrix; shared, nrow(a)^2 rows and 2 columns # counts: numbers of edges found by each thread; shared # in this version, the matrix lnks must be created ahead of time; since # the number of rows is unknown a priori, one must allow for the worst # case, nrow(a)^2 rows; after the run, the number of actual rows will be # in counts[1,length(cls)] getlinksthread <- function(a,lnks,counts) require(parallel) nr <- nrow(a) # get my assigned portion of a myidxs <- getidxs(nr) myout <- apply(a[myidxs,],1,function(rw) which(rw==1)) # myout[[i]] now lists the edges from node myidxs[1] + i - 1 nmyedges <- Reduce(sum,lapply(myout,length)) # my total edges me <- myinfo ","course":"ECS158"} {"quiz":" xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. IMPORTANT NOTE: If you believe that nothing needs to be placed into a blank, simply give Nothing as your answer in your file. If you do not answer at all, put 00 in your file. 1. (70) The following Hadoop code finds the product ax, where a is a (presumably very large) matrix, and x is a column vector. The input matrix (read from stdin) consists of a together with a prepended column of row numbers. The vector x is defined by code in the file x.py, which is executed via a Python import statement. So, if the input matrix is 0 1 2 0 1 5 8 -4 2 0 0 3 and the contents of x.py are x = [5,12,13] then the final output will be 29 69 39 Note: There are no row/element numbers in the final output. Also, don't worry about leading blanks in the output. Fill in the blanks. You may find the Python len() function useful; it returns the length of a Python list (array), so that for instance len(x) is 3 in the above example. Also, the int() function is like atoi() in C.
axmap.py: [numbers=left] #!/usr/bin/env python from x import x # x is input here from file x.py import sys for line in sys.stdin: tks = line.split() rownum = tks[0] row = tks[1:] sum = 0 for i in range(BLANKa): sum += BLANKb print BLANKc axred.py: [numbers=left] #!/usr/bin/env python import sys for line in sys.stdin: line = line.strip() tks = line.split(' print BLANKd 2. (30) Fill in the blanks in the Snow code below, which finds the unique elements of an array in parallel. The built-in R function unique() works like this: [numbers=left] > x <- sample(1:8,10,replace=T) > x [1] 4 7 3 1 1 2 3 3 2 8 > unique(x) [1] 4 7 3 1 2 8 Code: [numbers=left] # not claimed efficient, and # no guarantee of ordering in result parunique <- function(cls,x) parts <- clusterSplit(cls,1:length(x)) xparts <- lapply(parts,function(part) x[part]) tmp <- clusterApply(cls,xparts,BLANKa) tmp <- Reduce(BLANKb) BLANKc Solutions: 1. axmap.py: [numbers=left] #!/usr/bin/env python from x import x # input x from file x.py import sys for line in sys.stdin: tks = line.split() rownum = tks[0] row = tks[1:] sum = 0 for i in range(len(row)): sum += int(row[i]) * x[i] print ' axred.py: [numbers=left] #!/usr/bin/env python import sys for line in sys.stdin: line = line.strip() tks = line.split(' print tks[1] 2. [numbers=left] # not claimed efficient, and no guarantee of ordering in result parunique <- function(cls,x) parts <- clusterSplit(cls,1:length(x)) xparts <- lapply(parts,function(part) x[part]) tmp <- clusterApply(cls,xparts,unique) tmp <- Reduce(c,tmp) unique(tmp) ","course":"ECS158"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (45) The code below counts primes, in a manner similar to our pthreads example. The function crossout(), not shown, is the same as in that example, except for the obvious new arguments. Fill in the blanks. 
[numbers=left] // OMP code program to find the number // of primes between 2 and n; not // claimed to be efficient int primecounter(int n) int *prime; int nextbase=3,tot=0; blank (a) int nth = omp_get_num_threads(); int me = omp_get_thread_num(); int i,base; blank (b) blank (c) for (i = 3; i <= n; i += 2) prime[i] = 1; while(1) blank (d) base = nextbase; nextbase += 2; if (base > sqrt(n)) break; if (prime[base]) crossout(prime,n,base); blank (e) int mytot = 0; blank (f) for (i = 3; i < n; i += 2) mytot += prime[i]; blank (g) blank (h) return blank (i) 2. (45) Here we use R snow to count primes. We break the vector 3,5,7,9,... into chunks, one chunk for each cluster node, and have each cluster node count primes in its chunk. To do the latter, we divide by all the numbers in divvec, which is a vector of all the primes up through sqrt(n), which we find serially. For instance, say n is 1000. Then we find the primes up to sqrt(1000), which turn out to be 2,3,5,7,11,13,17,19,23,29,31; those 11 numbers form divvec, which we apply to finding primes up through 1000. Fill in the blanks. [numbers=left] # parallel prime counter, not claimed efficient # serial prime finder; can be used to generate # divisor vector serprime <- function(n) ... # apply divvec to one chunk of the prime vector, # return count of primes there processchunk <- function(primechunk,divvec) count <- 0 for (i in primechunk) # note blank (a)! if (all(i count <- count + 1 count primecount <- function(cls,n) # generate the vector 3,5,7,9..., through n prime <- seq(3,n,2) # serially find the primes up through sqrt(n) divvec <- serprime(ceiling(sqrt(n))) # remove those from our prime vector prime <- setdiff(prime,divvec) # break prime vector into chunks ixchunks <- blank (b) getchunk <- function(ixchunk) blank (c) primechunks <- Map(getchunk,ixchunks) # send those chunks to the cluster nodes, # calling processchunk() at each node counts <- blank (d) # put it all together blank (e) 3.
(10) Consider the pipelined prime-finding MPI code, pp.18ff. Say we have 10 nodes. Fill in the blanks: Node 8 will spend blank (a) time in line blank (b) than will Node 1. Solutions: 1. #include #include #include // required for threads usage // OMP code program to find the number of primes between 2 and n; not // claimed to be efficient void crossout(int *prime, int n, int k) int i; for (i = 3; i*k <= n; i += 2) prime[i*k] = 0; int primecounter(int n) int *prime; int nextbase=3,tot=0; #pragma omp parallel int nth = omp_get_num_threads(); int me = omp_get_thread_num(); int i,base; #pragma omp single prime = malloc((n+1)*sizeof(int)); for (i = 3; i <= n; i += 2) prime[i] = 1; while(1) #pragma omp critical base = nextbase; nextbase += 2; if (base > sqrt(n)) break; if (prime[base]) crossout(prime,n,base); #pragma omp barrier int mytot = 0; #pragma omp for for (i = 3; i < n; i += 2) mytot += prime[i]; #pragma omp critical tot += mytot; return tot + 1; main(int argc, char **argv) int n = atoi(argv[1]); printf(\" 2. 
# Snow code to count primes through n; # not efficient # serial prime finder; can be used to generate # divisor list serprime <- function(n) nums <- 1:n # all in nums assumed prime until shown otherwise prime <- rep(1,n) maxdiv <- ceiling(sqrt(n)) for (d in 2:maxdiv) # don't bother dividing by nonprimes if (prime[d]) # try divisor d on numbers not yet # found nonprime tmp <- prime != 0 & nums > d & nums %% d == 0 prime[tmp] <- 0 nums[prime != 0 & nums >= 2] # apply divvec to one chunk of the prime vector, # return count of primes there processchunk <- function(primechunk,divvec) count <- 0 for (i in primechunk) if (all(i %% divvec != 0)) count <- count + 1 count primecount <- function(cls,n) # generate the vector 3,5,7,9..., through n prime <- seq(3,n,2) # serially find the primes up through sqrt(n) divvec <- serprime(ceiling(sqrt(n))) # remove those from our prime vector prime <- setdiff(prime,divvec) # break prime vector into chunks ixchunks <- splitIndices(length(prime),length(cls)) getchunk <- function(ixchunk) prime[ixchunk] primechunks <- Map(getchunk,ixchunks) # send those chunks to the cluster nodes, # calling processchunk() at each node counts <- clusterApply(cls,primechunks, processchunk,divvec) # put it all together Reduce(sum,counts) + length(divvec) 3. As we go deeper into the pipe, each node has less work to do. That means the later nodes wait more time from one data receipt to the next, i.e. more time on line 83. ","course":"ECS158"} {"quiz":" xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Do NOT turn in this sheet of paper (unless you lack a laptop or have a laptop failure during the Exam). You will submit electronic files to handin. INSTRUCTIONS FOR SUBMISSION: Submit to the CSIF handin, under my account, directory 158quiz7 using the alphabetically earliest UCD e-mail address among your group members. Submit a file named Who.txt that lists the UCD e-mail addresses (without the @ucdavis.edu) of your group members, one per line.
Submit ONLY the files transpose.c (Problem 1, no main()) and Transgraph.cpp (Problem 2, including main()). Get your files into handin by 2 minutes after the exam. After that, you may be penalized. You will receive a 10-point bonus if you comply fully with the specs. 1. (50) Here you will write an OpenMP program to do matrix transpose, with the following specs: The matrix is square, nxn, with int entries. The transposition is done in-place. Do not create any auxiliary arrays. The signature of your function must be void transp(int *m, int n) Here is my test code: #include #include void printmat(int *m, int n) int i,j,k=0; for (i = 0; i < n; i++) for (j = 0; j < n; j++) printf(\" printf(\"\"); int main(int argc, char **argv) int i; int n = atoi(argv[1]), n2 = n*n; int *testm = malloc(n2*sizeof(int)); for (i = 0; i < n2; i++) testm[i] = rand() printmat(testm,n); transp(testm,n) printmat(testm,n); My test script will run the commands gcc transposemain.c transpose.c -fopenmp setenv OMP_NUM_THREADS 3 a.out 5 and will expect the output 7 6 9 3 1 15 10 12 9 13 10 11 2 11 3 6 12 2 4 8 11 8 7 13 6 7 15 10 6 11 6 10 11 12 8 9 12 2 2 7 3 9 11 4 13 1 13 3 8 6 2. (50) Consider the adjacency graph transformation method we've seen before. You will write Thrust code to transform an adjacency matrix for a directed graph to an equivalent but different, two-column form. In each instance in which element (i,j) is 1, indicating a link from i to j, the transformed matrix has a row consisting of (i,j). Here are the details: Your code will fill the blanks in the following: LARGE BLANK int main(int argc, char **argv) int x[12] = 0,1,1,0, 1,0,0,1, 1,1,0,0; int nr=3,nc=4; LARGE BLANK for (i = 0; i < n1s; i++) printf(\" Here newmat is a Thrust array you'll create. 
My script will run the commands g++ -g -O2 Transgraph.cpp -fopenmp -I/usr/local/cuda/include -DTHRUST_DEVICE_BACKEND=THRUST_DEVICE_BACKEND_OMP setenv OMP_NUM_THREADS 3 a.out and will expect the output 0 1 0 2 1 0 1 3 2 0 2 1 Your code must work for general nr and nc. Remember, you will submit an entire program, including the parts of main() given above. Recommended approach: Keep in mind that both your input and output arrays are one-dimensional, even though we are storing matrices. First, use Thrust to determine the indices of the 1s in the input array. Then use Thrust to fill in the contents of the output array. Solutions: 1. [numbers=left] #include // translate from 2-D to 1-D indices int onedim(int n,int i,int j) return n * i + j; void transp(int *m, int n) #pragma omp parallel int i,j,tmp; // walk through all the above-diagonal elements, swapping them // with their below-diagonal counterparts #pragma omp for for (i = 0; i < n; i++) for (j = i+1; j < n; j++) tmp = m[onedim(n,i,j)]; m[onedim(n,i,j)] = m[onedim(n,j,i)]; m[onedim(n,j,i)] = tmp; 2. 
[numbers=left] // transgraph problem, using Thrust #include #include #include #include #include // forms one row of the output matrix struct makerow const thrust::host_vector::iterator outmat; const int nc; // number of columns makerow(thrust::host_vector::iterator _outmat,int _nc) : outmat(_outmat), nc(_nc) __host__ __device__ // the j-th 1 is in position i of the orig matrix bool operator()(const int i, const int j) outmat[2*j] = i / nc; outmat[2*j+1] = i % nc; int main(int argc, char **argv) int x[12] = 0,1,1,0, 1,0,0,1, 1,1,0,0; int nr=3,nc=4,nrc = nr*nc,i; thrust::host_vector hx(x,x+nrc); thrust::host_vector seq(nrc); thrust::sequence(seq.begin(),seq.end(),0); thrust::host_vector ones(x,x+nrc); // get 1-D indices of the 1s thrust::host_vector::iterator newend = thrust::copy_if(seq.begin(),seq.end(),hx.begin(),ones.begin(), thrust::identity()); int n1s = newend - ones.begin(); thrust::host_vector newmat(2*n1s); thrust::host_vector out(n1s); thrust::host_vector seq2(n1s); thrust::sequence(seq2.begin(),seq2.end(),0); thrust::transform(ones.begin(),newend,seq2.begin(),out.begin(), makerow(newmat.begin(),nc)); for (i = 0; i < n1s; i++) printf(\" ","course":"ECS158"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (100) This problem implements the Mandelbrot computation on p.28 of our book. The idea is to generate the rectangular grid of points, then for each grid point, determine whether the sequence generated in Equation (2.1) in our text remains bounded after a specified number of iterations. Our definition of ``bounded'' will be that all the points in the sequence are within the disk of radius 2.0 with center (0,0). The function mandelbrot() will do the main computation, returning an R list that has been designed to feed into the R image() function. We will break the computation into chunks, and use R Snow to parallelize.
Say our X-axis interval is (3,4), while for Y it is (2,2.4), with an increment 0.2 between tick marks on the axes. The marks on X will be 3,3.2,...,4 and on Y they will be at 2,2.2,2.4. That's 6 marks on X and 3 on Y, for a total of 18 grid points. Note: > seq(1.2,2.2,0.3) [1] 1.2 1.5 1.8 2.1 Fill in the blanks: # arguments: # xl: left limit # xr: right limit # yt: top limit # yb: bottom limit # inc: distance between ticks on X, Y axes # maxiters: maximum number of iterations # return value: # we aim to return an R list which will be # used as input to R's image() function: # a matrix of 1s and 0s, 1 meaning sequence # remains bounded, absolute value <= 2, # plus the tick marks vectors for the X # and Y axes mandelbrot <- function(xl,xr,yb,yt,inc,maxiters) # determine where the tick marks go on # the X and Y axes xticks <- seq(xl,xr,inc) yticks <- seq(yb,yt,inc) nxticks <- length(xticks) nyticks <- length(yticks) m <- matrix(0,nrow=nxticks,ncol=nyticks) for (i in 1:nxticks) xti <- xticks[i] for (j in 1:nyticks) ytj <- yticks[j] # this is c in (2.1), but don't # confuse with R's c() ftn cpt <- complex(real=xti,imaginary=ytj) z <- cpt for (k in 1:maxiters) blank (a) if ( blank (b) ) break blank (c) list(x=xticks,y=yticks,z=m) # R Snow wrapper, divide grid into strips, # then call mandelbrot() on each one; we won't # mind if there are a few duplicate pixels in the # end mandelsnow <- function(cls,xl,xr,yb,yt,inc,maxiters) clusterExport(cls,c(\"xl\",\"xr\", \"yb\",\"yt\",\"inc\",\"maxiters\"), envir=environment()) ncls <- length(cls) clusterExport(cls, c(\"mandelbrot\",\"ncls\"), envir=environment()) getmyxinterval <- function(i) width <- (xr - xl) / ncls myxl <<- blank (d) myxr <<- blank (e) blank (f) tmp <- clusterEvalQ(cls, blank (g) ) fullmat <- NULL for (i in 1:ncls) fullmat <- blank (h) g <- list() g y <- seq(yb,yt,inc) g x) nrz <- nrow(g z <- g x <- g z) g <- list() g y <- seq(yb,yt,inc) g x) nrz <- nrow(g z <- g x <- g ","course":"ECS158"} {"quiz":"listings 
amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (60) The code below uses R Snow to implement a bucket sort similar to the OMP one in Sec. 1.4.2.6. See the comments at the beginning of the code. Fill in the blanks. # bucket sort with sampling; sort vector x # on cluster cls; data assumed to be fairly # uniformly distributed between a and b, # exclusive; return value is sorted x bsort <- function(cls,x,a,b) ncls <- length(cls) intwidth <- (b - a) / ncls # ship needed objects to workers clusterExport(cls, ________ // blank (a) envir=environment()) # have all workers set their ID clusterApply(cls, ________ ) // blank (b) # have all workers set their intervals clusterEvalQ(cls, ________ ) // blank (c) # sort locally at workers sortedchunks <- clusterEvalQ(cls, ________ ) // blank (d) # wrap up ________ // blank (e) setmyid <- function(i) myid <<- i setmyinterval <- function() mylow <<- a + (myid-1) * intwidth myhigh <<- a + myid * intwidth sortmine <- function() myx <- ________ // blank (f) sort(myx) 2. Fill in the blanks with terms from our course. [(a)] (10) A parallel application that presents no coding challenge, due to being easily parallelizable, is called . [(b)] (10) When we are worried whether a certain parallel algorithm will work well on very large hardware (e.g. many cores), we ask whether it is . [(c)] (10) Associating each thread with a specific core is called . 3. (10) Consider a ring network. Here the nodes are arranged in a circle, with serial links connecting successive nodes. When a node receives a packet, it checks whether this node is the intended destination. If so, it accepts the packet, but if not, it forwards the packet to the next node. Packets can be transmitted simultaneously on the various links. Packet motion is in one direction, say counterclockwise. There is a processing delay at each node.
Which is true of the following when an extra node is added? [(i)] Both latency and bandwidth will increase. [(ii)] Latency will increase but bandwidth will decrease. [(iii)] Latency will decrease but bandwidth will increase. [(iv)] Both latency and bandwidth will decrease. Solutions: 1. # bucket sort with sampling; sort vector x # on cluster cls; data assumed to be # fairly uniformly distributed between # a and b, exclusive; return value is sorted x bsort <- function(cls,x,a,b) ncls <- length(cls) intwidth <- (b - a) / ncls # ship needed objects to workers clusterExport(cls, c(\"x\",\"a\",\"b\",\"intwidth\", \"setmyid\",\"setmyinterval\",\"sortmine\"), envir=environment()) # have all workers set their ID clusterApply(cls,1:ncls,setmyid) # have all workers set their intervals clusterEvalQ(cls,setmyinterval()) # sort locally at workers sortedchunks <- clusterEvalQ(cls,sortmine()) # wrap up Reduce(c,sortedchunks) setmyid <- function(i) myid <<- i setmyinterval <- function() mylow <<- a + (myid-1) * intwidth myhigh <<- a + myid * intwidth sortmine <- function() myx <- x[x > mylow & x <= myhigh] sort(myx) 2a. embarrassingly parallel 2b. scalable 2c. processor affinity 3. (i) ","course":"ECS158"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (100) Below is MPI code for transforming an adjacency matrix, as in Section 4.13. Fill in the blanks. 
// transforming an adjacency matrix, // MPI version #include int nwrkrs, // number of workers // number of vertices, assumed // divisible by # of workers nv, me, // my node number *adj, // the adjacency matrix *xmat, // transformed matrix finaloutrownum; // rows in xmat when done void init(int argc,char **argv) int i,j,tmp; nv = atoi(argv[1]); MPI_Init(&argc,&argv); MPI_Comm_size(MPI_COMM_WORLD,&tmp); nwrkrs = tmp - 1; MPI_Comm_rank(MPI_COMM_WORLD,&me); adj = malloc(_____); // blank (a) // as test, fill adj with random 0s,1s for (i = 0; i < nv; i++) for (j = 0; j < nv; j++) adj[i*nv+j] = rand() if (me == 0 && nv < 10) for (i = 0; i < nv; i++) for (j = 0; j < nv; j++) printf(\" printf(\"\"); void mgr() int i, chunksize = nv / nwrkrs, outrownum = 0, nrecv; MPI_Status status; xmat = malloc(_____); // blank (b) int maxrecv = _____ ; // blank (c) for (i = 1; i <= nwrkrs; i++) MPI_Recv(_____ , // blank (d) maxrecv,MPI_INT,i, MPI_ANY_TAG,MPI_COMM_WORLD,&status); MPI______( // blank (e) &status,MPI_INT,&nrecv); outrownum += _____ // blank (f) finaloutrownum = outrownum; void wrkr() int chunksize = nv / nwrkrs, outrownum = 0, mystartrow, myendrow,i,j; xmat = malloc(chunksize*nv*2*sizeof(int)); mystartrow = _____ // blank (g) myendrow = _____ // blank (h) for (i = mystartrow; i <= myendrow; i++) for (j = 0; j < nv; j++) if (adj[nv*i+j] == 1) xmat[2*outrownum] = i; xmat[2*outrownum+1] = j; __________ // blank (i) (one full C stmt) MPI_Send(xmat,_____, // blank (j) MPI_INT,0,0,MPI_COMM_WORLD); int main(int argc,char **argv) int i,j; init(argc,argv); if (me == 0) mgr(); else wrkr(); if (me == 0 && nv < 10) for (i = 0; i < finaloutrownum; i++) for (j = 0; j < 2; j++) printf(\" printf(\"\"); MPI_Finalize(); Solutions: 1. // transforming an adjacency matrix, // MPI version #include int nwrkrs, // number of workers // number of verts.; assumed div. 
by # of workers nv, me, // my node number *adj, // the adjacency matrix *xmat, // transformed matrix finaloutrownum; // rows in xmat when done void init(int argc,char **argv) int i,j,tmp; nv = atoi(argv[1]); MPI_Init(&argc,&argv); MPI_Comm_size(MPI_COMM_WORLD,&tmp); nwrkrs = tmp - 1; MPI_Comm_rank(MPI_COMM_WORLD,&me); adj = malloc(nv*nv*sizeof(int)); // as test, fill adj with random 0s,1s for (i = 0; i < nv; i++) for (j = 0; j < nv; j++) adj[i*nv+j] = rand() if (me == 0 && nv < 10) for (i = 0; i < nv; i++) for (j = 0; j < nv; j++) printf(\" printf(\"\"); void mgr() int i, chunksize = nv / nwrkrs, outrownum = 0, nrecv; MPI_Status status; xmat = malloc(nv*nv*2*sizeof(int)); int maxrecv = chunksize * nv * 2; for (i = 1; i <= nwrkrs; i++) MPI_Recv(xmat+outrownum*2,maxrecv,MPI_INT,i, MPI_ANY_TAG,MPI_COMM_WORLD,&status); MPI_Get_count(&status,MPI_INT,&nrecv); outrownum += nrecv / 2; finaloutrownum = outrownum; void wrkr() int chunksize = nv / nwrkrs, outrownum = 0, mystartrow, myendrow,i,j; xmat = malloc(chunksize*nv*2*sizeof(int)); mystartrow = (me-1) * chunksize; myendrow = mystartrow + chunksize - 1; for (i = mystartrow; i <= myendrow; i++) for (j = 0; j < nv; j++) if (adj[nv*i+j] == 1) xmat[2*outrownum] = i; xmat[2*outrownum+1] = j; outrownum++; MPI_Send(xmat,outrownum*2,MPI_INT,0,0,MPI_COMM_WORLD); int main(int argc,char **argv) int i,j; init(argc,argv); if (me == 0) mgr(); else wrkr(); if (me == 0 && nv < 10) for (i = 0; i < finaloutrownum; i++) for (j = 0; j < 2; j++) printf(\" printf(\"\"); MPI_Finalize(); ","course":"ECS158"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (90) The CUDA code below solves a problem similar to the root finding example in Section 4.11. It finds the root of a user-supplied function f(), which is increasing on (0,1) and has a root somewhere inside. 
The initial search interval is (0,1), but the interval gets smaller with each iteration. At any iteration, the current interval is divided in subintervals, with each thread handling one subinterval. Fill in the blanks. #include #include #include // f() defined on (0,1), strictly increasing, // with a root somewhere inside __device__ float f(float x) return x*x - 0.5; #define BLOCKSIZE 192 __global__ void check1tile(float *devab) int threadnum = blockIdx.x * BLOCKSIZE + threadIdx.x, totthreads = gridDim.x * BLOCKSIZE; float a = devab[0]; float b = devab[1]; float xinc = (b-a) / ______; // blank (a) float x = __________; // blank (b) if (f(x) < 0) if (f(x+xinc) > 0) __________; // blank (c) __________; // blank (d) int main(int argc, char **argv) int i; float hosab[2] = 0.0,1.0, *devab; int niters = atoi(argv[1]); int nblocks = atoi(argv[2]); int float2 = 2 * sizeof(float); cudaMalloc((void **)&devab,float2); __________________; // blank (e) for (i = 0; i < niters; i++) dim3 dimGrid(nblocks,1); dim3 dimBlock(BLOCKSIZE,1,1); check1tile<<>>(devab); ______________________; // blank (f) ______________________; // blank (g) for(int i = 0; i < 2; i++) printf(\" cudaFree(devab); 2. (10) As a measure of thread divergence, we might consider utilization, meaning the proportion of time threads are actively executing an instruction. As a toy example, say we have 2 blocks of 4 threads each, executing 10 instruction cycles. That's a possible 80 instruction executions. But suppose 1 of the threads is idle during 6 cycles and another is idle during 15 cycles. Then our utilization would be (80 - 6 - 15)/80 or about 86. Consider the code in Problem 1, slightly modified: s = f(x); t = f(x+xinc); if (s < 0) if (t > 0) __________; // blank (c) __________; // blank (d) Suppose for simplicity that each of the 4 lines starting with the first if compiles to one machine instruction. 
With the specific f() used above, find the approximate utilization for those 4 instructions during the first iteration. Your answer must be an R expression. Note: Your answer should not depend on the number of threads. Solutions: #include #include #include // f() defined on (0,1), strictly increasing, // with a root somewhere inside __device__ float f(float x) return x*x - 0.5; #define BLOCKSIZE 192 __global__ void check1tile(float *devab) int threadnum = blockIdx.x * BLOCKSIZE + threadIdx.x, totthreads = gridDim.x * BLOCKSIZE; float a = devab[0]; float b = devab[1]; float xinc = (b-a) / totthreads; float x = a + threadnum * xinc; if (f(x) < 0) if (f(x+xinc) > 0) devab[0] = x; devab[1] = x + xinc; int main(int argc, char **argv) int i; float hosab[2] = 0.0,1.0, *devab; int niters = atoi(argv[1]); int nblocks = atoi(argv[2]); int float2 = 2 * sizeof(float); cudaMalloc((void **)&devab,float2); cudaMemcpy(devab,hosab,float2,cudaMemcpyHostToDevice); for (i = 0; i < niters; i++) dim3 dimGrid(nblocks,1); dim3 dimBlock(BLOCKSIZE,1,1); check1tile<<>>(devab); cudaThreadSynchronize(); cudaMemcpy(hosab,devab,float2,cudaMemcpyDeviceToHost); for(int i = 0; i < 2; i++) printf(\" cudaFree(devab); 2. All threads will execute the first if. The threads having a subinterval to the left of will also execute the second if. With threads, that means that there will be about instructions executions during opportunities, for a utilization of (1+2*sqrt(0.5))/4. ","course":"ECS158"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (15) The online help for the clusterApply() function in R's parallel package says, 'clusterApplyLB' is a load balancing version of 'clusterApply'. If the length 'p' of 'seq' is not greater than the number of nodes 'n', then a job is sent to 'p' nodes. Otherwise the first 'n' jobs are placed in order on the 'n' nodes. 
When the first job completes, the next job is placed on the node that has become free; this continues until all jobs are complete. Using 'clusterApplyLB' can result in better cluster utilization than using 'clusterApply', but increased communication can reduce performance. Furthermore, the node that executes a particular job is non-deterministic. Fill in the blanks: This is similar to the option in programming, with chunk size . 2. (65) Here you will work on a Thrust version of the CUDA code in our last quiz, which solved a problem similar to the root finding example in Section 4.11. It finds the root of a user-supplied function f(), which is increasing on (0,1) and has a root somewhere inside. The initial search interval is (0,1), but the interval gets smaller with each iteration. At any iteration, the current interval is divided in subintervals, with each thread handling one subinterval. Fill in the blanks. // Thrust example: find the root of an // increasing function on (0,1); not // assumed efficient #include #include #include #include __host__ __device__ float f(float x) return x*x - 0.5; struct signchange float width; thrust::device_vector::iterator ab; signchange( ________, // blank (a) float _width): ab(_dab),width(_width) __host__ __device__ bool operator()(int i) if (_________) // blank (b) return true; else return false; ; // do niters iterations, with nsubintervals // checked each time; typically would want // nsubintervals = number of threads float throot(int niters, int nsubintervals) int iter; thrust::host_vector hab(2); hab[0] = 0.0; hab[1] = 1.0; float width; // subinterval width thrust::device_vector dab(hab); thrust::host_vector hfoundit(1); thrust::device_vector dfoundit(1); thrust::device_vector seq(nsubintervals); thrust::sequence(seq.begin(),seq.end(),0); for (iter = 0; iter < niters; iter++) width = (hab[1] - hab[0]) / nsubintervals; __________( // blank (c) ____________ // blank(d), contains // .begin(), .end() ________________ // blank (e) 
signchange(dab.begin(),width)); thrust::copy(dfoundit.begin(), dfoundit.end(),hfoundit.begin()); hab[0] = _____________ // blank (f) hab[1] = hab[0] + width; thrust::copy(hab.begin(),hab.end(), dab.begin()); return _________; // blank (g) // test case int main(int argc, char **argv) float root; int niters = atoi(argv[1]), nsubintervals = atoi(argv[2]); root = throot(niters,nsubintervals); printf(\" 3. Suppose we wish to use Thrust to compress an upper-triangular matrix, storing only the upper-triangular portion, column by column. For instance, the matrix would be stored as (5,12,168,13,8,1). [(a)] (10) Which would be appropriate here, a Thrust scatter or gather operation? [(b)] (10) For a input matrix, what would be the appropriate map vector, given your answer in (a)? Assume row-major order. Answer in vector form, e.g. (8,88,-2,-6). Solutions: 1. dynamic; OpenMP; 1 2. // Thrust example: find the root of an increasing function on (0,1) #include #include #include #include __host__ __device__ float f(float x) return x*x - 0.5; struct signchange float width; thrust::device_vector::iterator ab; signchange(thrust::device_vector::iterator _dab, float _width): ab(_dab),width(_width) __host__ __device__ bool operator()(int i) if (f(ab[0]+i*width) < 0 && f(ab[0]+(i+1)*width) > 0) return true; else return false; ; // do niters iterations, with nsubintervals checked each time; typically // would want nsubintervals = number of threads float throot(int niters, int nsubintervals) int iter; thrust::host_vector hab(2); hab[0] = 0.0; hab[1] = 1.0; float width; // subinterval width thrust::device_vector dab(hab); // index of subinterval where sign change is found thrust::host_vector hfoundit(1); thrust::device_vector dfoundit(1); thrust::device_vector seq(nsubintervals); thrust::sequence(seq.begin(),seq.end(),0); for (iter = 0; iter < niters; iter++) width = (hab[1] - hab[0]) / nsubintervals; thrust::copy_if(seq.begin(),seq.end(), dfoundit.begin(), signchange(dab.begin(),width)); 
thrust::copy(dfoundit.begin(),dfoundit.end(), hfoundit.begin()); hab[0] = hab[0] + hfoundit[0] * width; hab[1] = hab[0] + width; thrust::copy(hab.begin(),hab.end(),dab.begin()); return hab[0]; // test case int main(int argc, char **argv) float root; int niters = atoi(argv[1]), nsubintervals = atoi(argv[2]); root = throot(niters,nsubintervals); printf(\" 3a. gather 3b. (0,1,5,2,6,10,3,7,11,15) ","course":"ECS158"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= GROUP QUIZ SUBMISSIONS INSTRUCTIONS: Your work must be submitted by 6 p.m, March 12. Submission must be done from within the classroom. Submit your work on handin, to the directory 158quiz7. Your .tar file name must conform to the rules explained in our Syllabus, Section 19.4. Your .tar file must comprise one file, named GrpQuiz.R. My grading script will be source(\"GrpQuiz.R\") # set cls, f, nsubints, niters (not shown) findroots(cls,f,nsubints,niters) 1. This problem will be similar to the root-finding examples we've seen. Here we are given a function f(), known to have one or more roots in (0,1), which you will write R Snow code to find. The call form will be findroots(cls,f,nsubints,niters) with arguments as follows: cls: The Snow cluster. f: The function whose roots are to be found. nsubints: Number of subintervals in each iteration, to be explained below. niters: Number of iterations. The return value is the vector of roots found, to the accuracy of the current subinterval width. The function f() is assumed to be continuous. If you wish to impose any additional restrictions on it, consult with me. It is not known how many roots it has. At each iteration, each of the Current Intervals will be divided into nsubints subintervals, each one of which will be checked for a sign change. The Current Interval starts as (0,1), but at any later iteration, there may be multiple Current Intervals waiting to be checked. 
You are welcome to download code from the Web, as long as it doesn't require compilation. The preceding sentence should not be construed to mean that a Web search will necessarily be helpful. Extra Credit will be given for the three fastest versions of the code. Note that in order to test your code's speed, you'll need a function f() that takes a nontrivial amount of time to evaluate, especially over many calls. ","course":"ECS158"} {"quiz":"Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (20) Fill in the blank (your answer should have the word and in it): According to class discussion, in developing a parallel program, the hardest sections to write are . 2. (20) Suppose we have a symmetric matrix , written in partitioned form where ' indicates transpose, and , the number of rows of is half the number of rows of . We have a column vector with the number of elements in being . We wish to compute the quadratic form by exploiting the partitioning (probably in parallel, but not relevant here). Show the algebraically simplified form of . Note: In your electronic file, write as A1, and so on. 3. (50) Here we will store many long arrays in one big array. We will store array in row of the big array. Anticipating having a great many large arrays, we will use OpenMP to build our big array. For convenience here, assume the number of arrays will be a multiple of the number of threads. Our function is #include void fillimage(float **arrs, int r, int narr, float *a) blank (a) int arr; float *arrstart; int me = omp_get_thread_num(); int nth = omp_get_num_threads(); int block = narr / nth; for (arr = blank (b) ) arrstart = blank (c) memcpy( blank (d)); Here arrs is the input arrays, each of length r, with there being narr arrays in all. The big array to be filled is a.
Here is a test example: int main() float x[4] = 1,2,3,4, y[4] = 5,6,7,8; float *xy[2] = x,y; float z[8]; int i; fillimage(xy,4,2,z); // results should be 1,2,...,8 for (i = 0; i < 8; i++) printf(\" printf(\"\"); Fill in the blanks. 4. (10) In our NMF tutorial, the approximating matrix can actually turn out to be of rank larger than the targeted value . Explain why. Remember, you are limited to a single line, though it can be rather long. Solutions: 1. The start and finish. 2. Note the fact from linear algebra (and our book's review) that . 3. #include void fillimage(float **arrs, int r, int narr, float *a) #pragma omp parallel int arr; float *arrstart; int me = omp_get_thread_num(); int nth = omp_get_num_threads(); int block = narr / nth; for (arr = me*block; arr < (me+1)*block; arr++) arrstart = arrs[arr]; memcpy(a+r*arr,arrstart, r*sizeof(float)); 4. Since pixel brightness is in [0,1], we truncate values greater than 1. This perturbs some of the data. So, even though we have set things up so that no linear combination of more than columns of the matrix can be nonzero, that property will be ruined. It doesn't change the effectiveness of the operation, though. Note by the way that rank(AB) <= min(rank(A),rank(B)), and that W and H have ranks at most k at any iteration, due to number of columns/rows. ","course":"ECS158"} {"quiz":"Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (10) Fill in the blanks: We are considering using MPI or snow on a certain app, with 8 workers. For MPI, the number of sockets used at each worker will be [ blank (a) ] while for snow it will be [ blank (b) ]. 2. Suppose we have partitioned matrices and , written in partitioned form: Here each of the submatrices in and , including the one drawn as ``,0'' are . [(a)] (20) Let , with Express in terms of the submatrices in and .
(In your electronic submission, use ``A1'' for etc. Use ' for transpose, and use juxtaposition for multiplication.) [(b)] (20) Assume all the are invertible (you may not need all of them to be so), with . will then exist, and we'll write the inverse as Show in terms of the and . Again, write as ``V1,'' etc. 3. (15) Consider the matrix-vector multiply snow code on p.22. If we were to convert it to a matrix-matrix multiply, everything in mmul() would pretty much carry over, except for the call to Reduce() in line 10. Show what the new contents of that line would be. 4. Consider the MPI code on pp.19ff. Suppose N is 10. (Note: The fact that this latter point is stated outside of (a) and (b) implies that it applies to both parts. Please keep this in mind in future quizzes.) [(a)] (20) How many times will Node 1 call MPI_Send()? [(b)] (15) Suppose in line 31 we were to accidentally write 1 instead of 0. What would happen? Choose one of the following: [(i)] One of the nodes will have an execution error, e.g. divide by 0. [(ii)] Two of the nodes will have an execution error. [(iii)] Three of the nodes will have an execution error. [(iv)] One of the nodes will hang. [(v)] Two of the nodes will hang. [(vi)] Three of the nodes will hang. [(vii)] No node will have an execution error or hang, but the printed answer will be incorrect. [(viii)] The correct answer will be printed out. Solutions: 1.a 8 (7 worker connections, 1 manager) 1.b 1 manager connection 2.a 2.b We have Thus , so . We also have . Solving for , we get 3. Reduce(rbind,mout) 4.a Node 0 will check the values 3, 5, 7 and 9 for divisibility by 3, with 5 and 7 surviving to go on to Node 1. The latter will take the 5 as Divisor, and then check 7 for divisibility by 5. The 7 survives this check and goes on to Node 2. That accounts for one call by Node 1 to MPI_Send(). Later it will make another such call, for ENDMSG, thus a total of 2 calls. 4.b The PIPEMSG messages become ENDMSGs. Node 0 will send three of them to Node 1.
The latter will use the first to set Divisor, and upon receiving the second, will send Dummy to Node 2. That node will use that message to set StartDivisor, then wait for a message to use for ToCheck --- but that message will never come. So, the answer is (iv). ","course":"ECS158"} {"quiz":"Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. Consider the code in Sec. 1.5.2, which finds the maximum burst in a time series. [(a)] (10) What will be printed out when test() is called? [(b)] (10) What OpenMP construct is similar to the if statement prior to setting rslts? 2. The R parallel package includes a function clusterApplyLB(). It is similar to clusterApply(), but, in the documentation's words, ...the first 'n' jobs are placed in order on the 'n' nodes. When the first job completes, the next job is placed on the node that has become free; this continues until all jobs are complete. [(a)] (10) What does the abbreviation ``LB'' likely stand for? [(b)] (10) What OpenMP keyword indicates an approach similar to this? 3. (30) The (out)-degree of a vertex i in a graph is the number of outlinks for that vertex. The OMP code below finds the degrees for the adjacency matrix a, returning them in dgs (preallocated by the caller). The number of vertices in the graph is n. Fill in the blanks. void finddgs(int *a, int n, int *dgs) blank (a) int i,j,dg; blank (b) for (i = 0; i < n; i++) dg = 0; for (j = 0; j < n; j++) blank (c) dgs[i] = dg; 4. (30) In an interpreted language such as R or Python, the best speed often depends on making good use of built-in functions, such as matrix multiplication. Here we will explore an approach to speedy computation of the mutual outlinks application in Sec. 2.4.3. (A vertex might link to itself, but this will be irrelevant). We won't parallelize the operation here, but we could. Fill in the blanks in the function meanmo() below.
Note that blank (a) must be filled in with the name of a built-in R function, and that you should find the following R code helpful > a <- rbind(1:2,3:4) > a [,1] [,2] [1,] 1 2 [2,] 3 4 > row(a) [,1] [,2] [1,] 1 1 [2,] 2 2 > row(a) == 1 [,1] [,2] [1,] TRUE TRUE [2,] FALSE FALSE > sum(row(a) == 1) [1] 2 > sum(row(a) > col(a)) [1] 1 > a[row(a) == col(a)] <- 8 > a [,1] [,2] [1,] 8 2 [2,] 3 8 Here is the function: meanmo <- function(a) aap <- a nummos <- blank (b) n <- nrow(a) nummos / blank (c) Solutions: 1.a 9, 16 1.b single 2.a load balancing 2.b dynamic 3. void finddgs(int *a, int n, int *dgs) #pragma omp parallel int i,j,dg; #pragma omp for for (i = 0; i < n; i++) dg = 0; for (j = 0; j < n; j++) dg += a[i*n+j]; dgs[i] = dg; 4. meanmo <- function(a) aap <- a %*% t(a) nummos <- sum(aap[row(aap) < col(aap)]) n <- nrow(a) nummos / (n*(n-1)/2) find degrees at each node; matrix output; OMP alt version of clusterApply() is clusterApplyLB(); mgr doles out chunks one at a time; each time worker done with task, mgr gives new one (a) \"LB\" likely stands for ________ (b) this takes a(an) ___________ approach to task sched ","course":"ECS158"} {"quiz":"Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (15) Suppose we are running a matrix application in C on a single-core system. We might be doing various operations with a matrix , such as calculating row and column sums, products of with vectors (say, both pre- and post-multiplying), and so on. Let denote the number of rows and columns of the matrix. Of course, the elements of are individual variables, and thus they may be subject to false sharing problems. Which of the following is true? [(i)] With larger values of , we probably won't have false sharing problems. [(ii)] With smaller values of , we probably won't have false sharing problems. [(iii)] The value of is irrelevant.
[(iv)] False sharing is not an issue if we have just one core. 2. (30) Consider the Quicksort example using the OpenMP task facility, Sec. 4.5.1. As we know, the smaller the granularity in parallel computation, the worse the adverse impact of overhead. So, here we might add another argument to qs(), named k. If a task is given a chunk (not in the first call) that is of size smaller than k, this thread will NOT create new tasks. In addition to adding the argument k to the declaration of qs() and to calls to that function, two lines of the original code must be changed. For each one, state the line number and what the new contents of that line will be. NOTE CAREFULLY: In your electronic file, treat the two changes as part (a) and part (b) of this problem. Also, you may assume that any nonparallel function used earlier in the book is available to you, callable without the function itself being in your code. Sample answer: change line 88 to if (!me) you 3. Consider the example on finding the maximal burst in a time series, Sec. 4.14. Suppose we wish to find the maximal sum rather than the maximal mean. So, in line 51 sum() will be called instead of mean(). (Note: The can be negative.) For convenience, we will in general make minimal changes to the code, for example using the same variable names. [(a)] (10) Give the names of any variables that will no longer be needed. If there are none, just write None. [(b)] (15) There is actually just one other line that needs to be changed. State which one, and how it should change. 4. (30) The Rdsm package has a barrier facility. Below is the code, with some blanks. Fill them. > barr function () realrdsmlock(brlock) count <- barrnumleft[1] sense <- barrsense[1] if (count == 1) barrnumleft[1] <- blank (a) barrsense[1] <- blank (b) realrdsmunlock(brlock) return() else barrnumleft[1] <- barrnumleft[1] - 1 realrdsmunlock(brlock) repeat if (barrsense[1] != sense) blank (c) Solutions: 1.
(iv) With multiple cores, a write to a variable by one core may unnecessarily invalidate other variables in the same cache line at other cores. With a single core, there is no such problem. 2. void swap(int *yi, int *yj) int tmp = *yi; *yi = *yj; *yj = tmp; int separate(int *x, int low, int high) int i,pivot,last; pivot = x[low]; // would be better to take, e.g., median of 1st 3 elts swap(x+low,x+high); last = low; for (i = low; i < high; i++) if (x[i] <= pivot) swap(x+last,x+i); last += 1; swap(x+last,x+high); return last; int cmpints(int *u, int *v) if (*u < *v) return -1; if (*u > *v) return 1; return 0; void qs(int *z, int zstart, int zend, int firstcall, int k) #pragma omp parallel int part; if (firstcall == 1) #pragma omp single nowait qs(z,0,zend,0,k); else if (zstart + k < zend) part = separate(z,zstart,zend); #pragma omp task qs(z,zstart,part-1,0,k); #pragma omp task qs(z,part+1,zend,0,k); else qsort(z+zstart,zend-zstart+1,sizeof(int),cmpints); 3.a Line 33. 3.b Line 55: xbar = xbar + x[perend]; 4. > barr function () realrdsmlock(brlock) count <- barrnumleft[1] sense <- barrsense[1] if (count == 1) barrnumleft[1] <- myinfo ","course":"ECS158"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (10) Fill in the blank with a term from our course: Compared to OpenMP, code written in CUDA tends to have smaller/finer . 2. (20) Consider the CUDA example of transforming an adjacency matrix in Sec. 5.12, which we will compare to the similar OMP example in Sec. 4.13. In the latter, consider line 54. State which line (specify the line number) in the CUDA version this corresponds to, or if there is none, why none is needed. 3. (20) Consider the CUBLAS example, Sec. 5.17.1.1. Suppose we simply want to compute the product of the top row of our matrix with the specified vector. Show how to change line 33 to accomplish this. 4. 
Consider the Mutual Outlinks example, Sec. 5.8. There each thread does a lot of work, but here we'll change it so that each thread will handle exactly one row of the adjacency matrix. Line 51 will replace the 192 by tperb (``threads per block''), taken from the command line. [(a)] (20) What specific restriction must we impose on the user in terms of the grid and block sizes? [(b)] (15) State which two lines must be changed in the kernel, and show what they should be changed to. [(c)] (15) When I first compiled the (unchanged) program, I got an error message, ``atomicAdd undefined'' (but no other errors). What likely error did I make? Solutions: 1. granularity 2. No corresponding line. The second kernel call depends on results from the first, and CUDA will notice that this will require a wait. 3. cublasSgemv('n',1,n,1.0,dm,n,dm,n,0.0,drs,1); 4.a Must have n = nblk * tperb, so that we don't have more or fewer threads than matrix rows. 4.b Change line 14 to i = me; Delete line 19. 4.c Forgot to include -arch=sm_11 in the compile line. This is needed because atomicAdd() is a function usable only on models 1.1 and above. ","course":"ECS158"} {"quiz":"Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (20) The NVIDIA GPU execution model is Single Instruction Multiple Threads, SIMT. What is the acronym for MPI? 2. (20) The process ID in MPI has a 2-level hierarchical structure, rather analogous to blockIdx and threadIdx in CUDA. What are the official MPI terms for the two levels? (Answer on one line.) 3. (20) Consider the run-length decoding examples in Secs. 10.5 and 10.6, the first in OMP and the second in Thrust. State the line number in Sec. 10.6 that is analogous to line 10 in Sec. 10.5. If there is no such line in 10.6, answer None. 4. (20) In the example titled, Removing 0s from an Array, Sec.
8.4, consider what may happen in terms of the order of the nonzero values within the no0s array. Which one of the following is true? [(i)] The order will be the same as in the has0s array. For instance, if 12 and 5 were in has0s, with 12 in a lower-index position than 5, then the index of 12 in no0s will be lower than that of 5. [(ii)] The order will be reversed. [(iii)] The order will be random, in the sense that in several runs of the program, different orders may occur in different runs. [(iv)] The order will be nonrandom, but neither (i) nor (ii) will necessarily occur. 5. (20) In this problem you will develop an OpenMP function that works like thrust::copy_if(). Fill in the blanks: #include #include int u[8] = 1,0,1,1,0,1,0,1, v[8] = 1,5,6,3,0,2,0,8, w[8]; int f(int a) return a != 0; void omp_copy_if(int *x1, int *x2, int nin, int *y, int *nout, int (*boolftn)(int)) blank (a) #pragma omp parallel int i; #pragma omp for for (i = 0; i < nin; i++) if ( blank (b) blank (c) ntrue++; blank (d) #pragma omp single *nout = ntrue; main(int argc, char **argv) int i,no; omp_copy_if(v,u,8,w,&no,f); // should print 1 6 3 2 8 for (i = 0; i < no; i++) printf(\" Solutions: 1. SPMD 2. communicator, rank 3. None; Thrust does not work directly with thread number. 4. (i) 5. #include #include int u[8] = 1,0,1,1,0,1,0,1, v[8] = 1,5,6,3,0,2,0,8, w[8]; int f(int a) return a != 0; void omp_copy_if(int *x1, int *x2, int nin, int *y, int *nout, int (*boolftn)(int)) int ntrue = 0; #pragma omp parallel int i; #pragma omp for for (i = 0; i < nin; i++) if (boolftn(x2[i])) #pragma omp atomic ntrue++; y[ntrue-1] = x1[i]; #pragma omp single *nout = ntrue; main(int argc, char **argv) int i,no; omp_copy_if(v,u,8,w,&no,f); // should print 1 6 3 2 8 for (i = 0; i < no; i++) printf(\" ","course":"ECS158"} {"quiz":"Name: 1. This problem concerns the CUDA code for Gaussian elimination, Sec. 11.5.2.
Assume that the code that calls the kernel will have quantities , and at the beginning of Sec. 11.5 stored in the variables a, b and n, respectively. The array a is one-dimensional, length . We have another array ab, one-dimensional, length , corresponding to the argument of the same name in the kernel. [(a)] (20) Fill in the blank in the following statement: dim3 dimBlock( blank ,1,1); [(b)] (20) The code preparing ab will include the following, in which you will fill in the blank: for (j = 0; j < n; j++) ab[ blank ] = b[j]; 2. (20) Consider applying the smoothing idea, Sec. 13.5.1, to audio, in the time domain. We could adapt the code in Sec. 4.14 for this. The argument k now will be the number of neighbors to smooth with, using k/2 data points before and after the given point. We would delete much of the code. In particular, replace line 48 by perlen = k; deleting line 62. We would have to add a crucial line. State the line number after which the new statement would be added, and state what single line should be added; don't worry about ``corner cases,'' say what happens near the ends of the array. 3. (40) Use ``Snow '' (the portion of the R library parallel that was derived from the old snow library) to implement the run-length coding decompression algorithm in Sections 10.5 and 10.6. The ``declaration'' of your function will be decomp <- function(x,cls) where x is the compressed vector, and cls is a Snow cluster. Your code need not be optimal, just parallel and correct. Submit just the function itself in the end, but you may wish to temporarily put in a test case so you can try your code through OMSI. Solutions: 1.a n 1.b (j+1) * (n+1) 2. After line 56, insert x[perstart + k/2] = xbar; 3. 
Outline: split x into chunks for the workers; each worker does a straightforward decompression of its chunk; apply Reduce(c, ) to what is returned from the workers ","course":"ECS158"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. SHOW YOUR WORK! 1. (20) Just above (10.39), p.318, suppose it said, ``The nodes collectively will repeatedly cycle through idle and busy periods, termed I and B periods,'' instead of defining I and B in terms of just node 0. Here we are in an I period if no nodes are active, and in a B period if at least one node is active. How would (10.39) change? 2. (20) Exercise 20, Chapter 5, p.155. 3. Consider the board game in Sec. 2.8, pp.14-15. This can be modeled as a Markov chain, with state space 0,1,2,...,7. [(a)] (10) Find . [(b)] (10) Find the long-run fraction of turns in which you get a bonus roll, expressed as a function of . Note: If you roll and hit 3, and then roll a second time for the bonus, that still only counts one turn, not two. 4. The moment generating function of the random pair (X,Y) is defined by [(a)] (10) For the density (5.17), p.105, find . Express your answer in integral form. [(b)] (10) For a general random pair (X,Y), express Cov(X,Y) in terms of moment generating functions. 5. (20) Consider a Markov chain . Let have the distribution (an example is discussed in Sec. 10.1.2.4). Show that ``given the present, the past and the future are independent,'' in the sense that for , and are independent, given . Solutions: 1. Again I will have a geometric distribution. However, the ``success probability'' (the parameter p in (3.66)) changes. ``Success'' here will be that at least one of the n nodes becomes active, i.e. generates a message to send. This occurs with probability So, 2.
From the Law of Total Variance, Since the conditional distribution of N given L is Poisson with parameter L, . The result follows. 3.a We can go from square 6 to square 7 either directly, by rolling a 1, or via a bonus, by rolling a 5 to get to 3, then rolling a 4 to get to 7. The probability of this is 3.b We can only get a bonus from squares 0 (by rolling a 3), 1 (by rolling a 2), 2, 5, 6 and 7. It's impossible from square 4, and we are never on square 3 anyway. So, 4.a Use (5.15): 4.b For example, to get E(XY), note that Thus 5. We must show that But the left-hand side of () is while the two factors in the right-hand side are and Multiplying () by (), we find that the product matches ()! So we're done. ","course":"ECS256"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. SHOW YOUR WORK! 1. (20) We wish to have additional output from the DES code MM1.R: Among all the times that the server finishes a job, what proportion have the property that the server immediately starts another job? Add code to MM1.R to achieve this. Your answer must be in the form, ``Between lines 22 and 23, add the following code...'' 2. Consider the file backup storage example on p.356, but with . Express your answers as common fractions, reduced to the lowest terms. [(a)] (25) Say we look at the end of track n, for n large. Find the probability that the current file has occupied all or part of exactly three tracks so far. [(b)] (25) Find the probability that the first two tracks (0 and 1) have at most two files in it. (In any case, the last file in the track will necessarily be partial.) 3. (30) We will draw a sample of size 2 from a population that consists of k subgroups. 
Our sampling procedure is to choose a person at random from the entire population, and then to choose our second person from the same subgroup that the first person belongs to. (The sampling is done with replacement.) The variable of interest, X, has mean and variance in subgroup i, i = 1,...,k. A proportion of the population consists of subgroup i, i = 1,...,k, with . Find the variance of the sample mean, , in terms of these quantities. Your answer will use the symbol, but should be reasonably concise for full credit. Solutions: 1. The main point is to add code somewhere between lines 53 and 57 to increment the proper count. It also must be initialized and printed out. 2a. We need P(A(n) 2.0). So, evaluate the long-run age density from our book, and then integrate from 2 to 3. 2b. Let and be the lengths of the two files. We need . Drawing a picture, we see that the easiest way to evaluate this is to find and then subtract from 1.0. That probability is equal to 3. Use the Law of Total Expectation. where G is the group number. From our usual properties of in random samples, we have So, Also Finally, add. ","course":"ECS256"} {"quiz":"1. This problem concerns the Trivedi cell phone service model, pp.202ff in our book. Of course, use the same notation that is used in the book. [(a)] (20) Give the balance equation for state 0, i.e. (11.5) for the case . [(b)] (20) As we progress through time, the cell will alternate between what we will call AllCallsBlocked and NewCallsOK periods. Find the mean length of AllCallsBlocked periods. 2. (20) Suppose the -element vector has mean vector and covariance matrix . Define the scalar to be the difference between the first element of and the average of the remaining elements (``difference'' meaning the former minus the latter). Find , expressed only in terms of , , and . You may use ellipsis (``...'') notation, but may NOT refer to individual elements of or . 3.
(20) In this problem you will write an R function mcsim(p,nsteps) whose purpose it is to find the approximate vector for a discrete-time Markov chain via simulation. It would be used mainly for infinite-state space settings. Here are the details: The argument p is a user-supplied function that specifies the transition matrix; e.g. p(2,5) returns the probability of going from state 2 to state 5 (in one step). The argument nsteps is the number of time steps to be simulated, i.e. . The return value will be the approximate vector, expressed as an R list, with a component for each nonzero value of our approximate . For instance, if we name our list pi and find that and are approximately 0.8 and 0.2, then we will set pi[[2]] and pi[[4]] to 0.8 and 0.2. Use the following method. First, use simulation to determine whether our new state is 1 or not. If not, then simulate to determine whether our new state is 2 or not, and so on. If you are not very familiar with R lists, try typing this code into R's interactive mode: z <- list() z[[3]] <- 8 z i <- 6 z[[i]] <- 8888 z Solutions: 1.a 1.b (It was explained during the quiz that AllCallsBlocked periods consist of times during which NO call not already in service will be accepted, no matter whether handoff call or new call originating within the cell.) Picture what happens when we enter such a period. There had been one free channel, but now none is free. The period will end when some call currently in service ends, which could occur either by someone hanging up or by a caller leaving the cell. Those events are occurring for each caller at rate , for a total rate of . Moreover, the time passing before such a transition has an exponential distribution (due to the fact that a minimum of independent exponentially-distributed random variables itself has an exponential distribution). In other words, the mean length of the period will be . 2. Define a -component random vector Then Thus by (12.54). 3. 
# find the approximate stationary distribution pi for a discrete-time # Markov chain, using simulation; intended mainly for infinite state # spaces # states are 1,2,3,... # arguments: # p(i,j): user-defined, returns element (i,j) of the transition # matrix # nsteps: number of time steps to simulate # value: # an R list, one element for each nonzero element of pi mcsim <- function(p,nsteps) # record number of visits to each state found so far visits <- list() currstate <- 1 # arbitrary starting state visits[[currstate]] <- 1 for (i in 1:nsteps) # get next state currstate <- simnextj(currstate,p) # R lists are touchy about adding new elements, so updating visits # list is delicate if (currstate > length(visits)) visits[[currstate]] <- 1 else if (is.null(visits[[currstate]])) visits[[currstate]] <- 1 else visits[[currstate]] <- visits[[currstate]] + 1 # sim done, change to proportions for (i in 1:length((visits))) tmp <- visits[[i]] if (!is.null(tmp)) visits[[i]] <- visits[[i]] / nsteps visits # simulate next state, given currently at i simnextj <- function(i,p) j <- 1 tot <- 0 repeat pij <- p(i,j) # must use CONDITIONAL probability if (pij > 0 && runif(1) <= pij / (1-tot)) return(j) tot <- tot + pij if (tot >= 1) return(j) j <- j + 1 # test example: keep flipping coin; each time get 3 consecutive heads, # win prize; state i means i-1 consecutive heads so far (should be 0,1,2 # but to have indices start at 1, have 1,2,3) consec3 <- function(i,j) if (i==1 && j == 1 || i==1 && j == 2 || i==2 && j == 3 || i==2 && j == 1) return(0.5) if (i==3 && j == 1) return(1) return(0) # test with mcsim(consec3,1000) ","course":"ECS256"} {"quiz":"DIRECTIONS: Write your solutions in a single .tex file, including R code. Your .tar package will consist of that file, its output .pdf file, and a separate file for each problem requiring R code, each file named in the form x.R, where x is the problem number.
Name your .tar file as you did in the homework, but with your own address only. Your submission must be in the 256quiz2 directory in handin, timestamped on or before 3:00 p.m. NO LATE SUBMISSIONS; keep submitting the work you have, as you go along, so that you at least have something turned in. You are not necessarily expected to solve all the problems. 1. (35) Consider the ``Wharton experiment,'' p.427 of our book. Write a function with ``declaration'' wharton <- function(indata,sg,nnoise) that performs this experiment on any data frame indata. Here nnoise is the number of noise variables to be added, where is sg. It is assumed that the response variable is the one in the final, i.e. rightmost, column of indata. There will be no return value. Instead, your function will make a call print(summary(lm())) to run the regression and print out the results, replete with asterisks. (You may not get a lot of them.) You may wish to convert indata to a matrix within your code. Place your code in 1.R, and a listing in your .tex file. 2. (35) Suppose and that has a uniform distribution on (0,1). Find . For full credit, have no explicit integrals. 3. (30) Suppose has support (0,1), with density This is a 1-parameter density family, with the parameter . Note that is a constant to make the function integrate to 1; it is not a second parameter. Write a function with ``declaration'' gmmq <- function(x,initq) which uses the gmm package (this is required) to estimate from the data vector x and an initial guess initq. The return value of the function will be the object that gmm() returns. You'll need to use the recipe given to Nick by the author of gmm(). Note: You must include your mathematical derivation in your .tex file, and your R code in both that file and 3.R. Solutions: 1. wh <- function(xy,sg,nnoise) n <- nrow(xy) p <- ncol(xy) xy <- cbind(xy,matrix(sg*rnorm(n*nnoise),ncol=nnoise)) xy <- as.matrix(xy) print(summary(lm(xy[,p] ~ xy[,-p]))) 2. 3.
gmmq <- function(x,initq) if (is.vector(x)) x <- matrix(x,ncol=1) g <- function(th,x) q <- th[1] c <- 1/(2-q) c*q^2/2 + c * (1-q^2) - x gmm(g,x,c(q = initq),lower=0,upper=1, method='Brent') library(freqparcoord) data(mlb) xy <- mlb[,c(4,6,5)] sim <- function(n) tmp <- runif(n,0,0.5) ifelse(sample(0:1,n,prob=c(0.3333,0.6667),replace=T), tmp+0.5,tmp) gmmq <- function(x,initq) if (is.vector(x)) x <- matrix(x,ncol=1) g <- function(th,x) q <- th[1] c <- 1/(2-q) c*q^2/2 + c * (1-q^2) - x gmm(g,x,c(q = initq),lower=0,upper=1, method='Brent') ","course":"ECS256"} {"quiz":" Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (10) Give a single assembly language instruction equivalent to [fontsize=-2] popl popl popl assuming that we do not care what the popped values actually are. 2. (10) When the call scanf(\"ddd\",u,v,w) is compiled, how many push operations will appear before the CALL instruction? 3. (10) Say you are running some program on CSIF that makes use of a special library in your own home directory, say /home/thisisme/. What command should you run to enable the OS to find that library when you execute the program? 4. (10) List the Intel-specific registers (using their official Intel names) whose values are affected when a RET instruction executes. 5. This problem concerns the code in pp.137-139. Suppose we were to change things so that addone() would have (as viewed as a function callable from C) the signature [fontsize=-2] int addone(int x) as opposed to what was in the version in the book, [fontsize=-2] void addone(int *x) The function will now return the value of its argument plus 1. 
[(a)] (30) Fill in the gap in the revised version of addone(): [fontsize=-2] .text .globl addone addone: # insert at most 4 instructions here ret [(b)] (30) Suppose the call in TryAddOne.c is now wrapped inside a print call: [fontsize=-2] printf(\" I ran the new code through gcc -S, an excerpt of which appears below. Fill in the gaps. [fontsize=-2] movl\t .LC0, ( call\tprintf Solutions: 1. addl 12, esp 2. 4 3 [fontsize=-2] setenv LD_LIBRARY_PATH /home/thisisme 4. ESP, EIP 5.a [fontsize=-2] .text .globl addone addone: movl 4( incl ret 5.b [fontsize=-2] movl .LC0, ( call printf ","course":"ECS50"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. Consider the GDB output at the top of p.149. Answer the following questions about line 15: [(a)] (10) Which of the four numbers is an address? [(b)] (10) What kind of address is your answer in (a)? (i) Physical address. (ii) Virtual address. (iii) Page number. (iv) Offset. (v) Stack position. (vi) I/O port number. (vii) None of these. 2. (20) Suppose we are running a program on CSIF. It seems slow to us, and we suspect that this may be due to excessive cache misses or page faults. Fill in the blanks: Using material from our course, we can determine the number of using the command, but we cannot determine the number of . Of these two numbers the one that causes more slowdown is . 3. (10) Consider the code on p.231. What is the slot number for z? 4. (25) In the example on p.199, give a numerical expression for the offset-within-page of q[0]. 5. 
(25) I changed the function Min() on p.225 to: [fontsize=-2] public static int gy(int U, int V) int T; _______________________________________________________________ _______________________________________________________________ _______________________________________________________________ This produced the code [fontsize=-2] public static int gy(int, int); Code: 0:\tiload_0 1:\ticonst_3 2:\tiadd 3:\tiload_1 4:\tif_icmpge\t14 7:\tiload_0 8:\ticonst_3 9:\tiadd 10:\tistore_2 11:\tgoto\t18 14:\tiload_1 15:\ticonst_3 16:\tiadd 17:\tistore_2 18:\tiload_2 19:\tireturn Fill in the blanks above. Solutions: 1a. Not graded. 1b. (ii) 2. page faults, time, cache misses, page faults 3. 3 4. 0x7bf-4-1999 4 5. [fontsize=-2] if (U+3 < V) T = U+3; else T = V+3; return T; ","course":"ECS50"} {"quiz":" Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (20) Below is an excerpt from a conversation on the Web, written by a developer of some commercial software product: For a reasonably large project, the procedure symbols add over 1MB of data to the final executable. Not only that, but one does not necessarily want customers to see all the functions and files called... In the context of our course, how would one avoid this? 2. In the ``toy'' example on p.13, suppose we devote 9 bits to the mantissa and 7 bits to the exponent. [(a)] (20) What would be the representation of 1.25? Answer in hex. [(b)] (20) What is the largest positive number that can be stored? Express your answer in the form , with and in base 10, in simple form. 3. (40) Consider the code in the middle of p.19. What would the following print out? [fontsize=-2] printf(\" printf(\" Solutions: 1. The writer is referring to retaining the symbol table during compilation. In our context, this would be remedied by NOT using the -g option in GCC. 
2.a The 5 would now be represented as 000000101 and the -2 as 1111110, so the full representation would be 0000001011111110 or 0x02fe. 2.b The mantissa would have to be as large as possible, which would be 011111111, which is . Similarly, set the exponent to 0111111, which is . So, the largest possible storable number would be . 3. [fontsize=-2] 64636261 1 ","course":"ECS50"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. Note: There is an ASCII table on p.37. 1. Suppose our machine has a 5-bit word size, with 2s complement storage. [(a)] (10) Using 0s and 1s, show the representations of +5 and -5. [(b)] (20) What are the largest representable positive number and most negative (i.e. largest in absolute value) negative number? [(c)] (20) Suppose x and y are declared as int on this machine. Give an example of positive values of x and y for which x+y is negative. 2. (10) Consider the code [fontsize=-2] char c = '+'; Give the hex form of the byte at which c is stored. 3. (20) Consider the code at the top of p.20. If I were read in as 22, would any element of X be affected? If so, state which one; if not, state why not. 4. (20) Consider the code [fontsize=-2] int x; strncpy(&x,\"88\",2); printf(\" Say this is run on a 16-bit machine. State what value will be printed out. Your answer must be in the form of a numerical expression, e.g. . Solutions: 1. 00101, 11011; 15, -16; 8+8 is -16, for example 2. 0x2b 3. X[2] 4. '8' has ASCII code 0x38, so the contents of x will be 0x3838, and the value printed out will be . ","course":"ECS50"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. 
(20) Consider the output of the code on p.29. Suppose we had a 64-bit system. Assuming the address of X did not change, what would be the address of Y? 2. (20) Consider the declaration [fontsize=-2] char q[8][3]; If q[0][2] happens to be stored at address 0x20c, at what address will q[2][1] be stored? 3. (15) Consider the code [fontsize=-2] unsigned int *f,*g; char *s1 = \"gr88\", *s2 = \"bb99\"; f = s1; g = s2; printf(\" Fill in the blank: The number printed out will be negative if and only if the machine . 4. The function print5() below prints out an unsigned int in base-5. So, [fontsize=-2] unsigned int x = 39; print5(x); would output 124 (actually 0124, since leading 0s aren't suppressed here). Assume we have 8-bit words. [(a)] (30) Fill in the blanks in the code: [fontsize=-2] int print5(unsigned int x) unsigned int i,d, power5=_____________________, r=x; for (i = 0; i < 4; i++) d = r / power5; r = r printf(________________________________); power5 /= 5; printf(\"\"); [(b)] (15) What impact, if any, did the assumption of an 8-bit word size have on the code? Solutions: 1. 0xbffffb7c 2 0x211 3. is little-endian 4a [fontsize=-2] int print5(int x) int i,d, power5=125, r=x; for (i = 0; i < 4; i++) d = r / power5; r = r printf(\" power5 /= 5; printf(\"\"); 4b the initial value of power5 was 125 ","course":"ECS50"} {"quiz":" Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (10) Fill the blank: The sign (+, -, 0) of the result of executing a CMPL instruction is recorded in the . 2. Answer the following questions as to what occurs during the execution (not the fetch) of the instruction [fontsize=-2] addl [(a)] (10) How many times will a number be placed onto the address bus? [(b)] (10) Which control lines will be used? 3. 
Consider the example of counting lower-case letters, pp.93ff, but modified so that line 2 is [fontsize=-2] x: .string \"c29jem\" [(a)] (15) For each of the line numbers and operands listed below, state what addressing mode is being used. rrr line & operand & addressing mode 11 & & 16 & & 16 & & [(b)] (30) Suppose x is at address 0x500c, and consider the situation that will exist when we reach done. Show (in hex) the contents at each of these addresses: rr address & contents 0x500c & 0x500d & 0x500e & 0x500f & 0x5010 & 0x5011 & 0x5012 & 0x5013 & 0x5014 & 0x5015 & 0x5016 & 0x5017 & 0x5018 & 4. (25) Consider the code at the top of p.93, at the line [fontsize=-2] andl x+ & immediate 16 & & indirect 16 & & register 3.b rr address & contents 0x500c & 63 0x500d & 32 0x500e & 39 0x500f & 6a 0x5010 & 65 0x5011 & 6d 0x5012 & 0 0x5013 & 0 0x5014 & 0 0x5015 & 1 0x5016 & 0 0x5017 & 1 0x5018 & 0 4. First, -16 is -0x10, i.e. 0xfffffff0. That last 0 is four 0 bits, while the fs are each 1111. So, the mask will set the last four bits of ESP to 0s, while leaving the other bits intact. So, the 0x88888168 in ESP will change to 0x88888160. ","course":"ECS50"} {"quiz":" Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (10) Fill in the blank: In addition to using a stack for subroutines, the Intel hardware also uses a stack for arithmetic. 2. Look at the assembler output on p.113. [(a)] (15) Suppose the instruction jz done were placed between lines 29 and 30. What would the 75F8 for jnz top change to? [(b)] (15) Suppose after linking, it has been decided that the .data section will begin at 0x00052000. Then what will change, if anything, in lines 24-34, and what will be the new value there if there is a change? 3. 
(60) The following code goes through an array that is initially pointed to by EAX, and searches in the array for the value in EBX. The array is terminated by a 0. The result will be placed into EDX---either the index at which the value was found, or -1 if it was not found. For example, if the array is (1,5,2,13,0) and the value to be searched for is 13, then 3 will be placed into EDX. A search for 5 will result in a 1 in EDX. If the value to be searched for is 88, then -1 will be placed into EDX. Fill in the blanks: [fontsize=-2] movl top: movl ( # put 1 instruction here jz foundit # put 1 instruction here jz notthere # put 1 instruction here jmp top notthere: # put 1 instruction here jmp done foundit: subl # put 1 instruction here movl done: addl 0, jz notthere addl -1, jmp done foundit: subl shrl ","course":"ECS50"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (20) The code below subtracts 1 from EAX if the contents of that register are negative, but leaves EAX unchanged otherwise. Fill in the blanks: [fontsize=-2] subl 0, js done decl done: ... 2a. 35 2b. 2 2c. second 2d. JS looks at Bit 7 of EFLAGS; the latter has value 0x287, which is 11 1000 0111, so Bit 7 had a 1, so we do jump ","course":"ECS50"} {"quiz":" Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (20) Fill in the blank with a term from our course: Compilers for Intel machines typically have EBP play a major role in the maintenance of a linked list of . 2. (20) State all lines (if any) on p.151 in which the linker is invoked. 3. (20) A ``mask'' is used in Chapter 7 in a couple of places. Show one such instruction in that chapter. 4. 
(20) The C library contains a function with the following signature: [fontsize=-2] char *strncpy(char *s1, char *s2, int n); It copies the first n characters at s2 to s1. Suppose we wish to call this function in assembly language, copying 8 characters from the array pointed to by EBX to the array in the .data section beginning at a word with the label w. Show assembly code to do this. 5. (20) Consider the code at p.153, bottom. Fill in the 2 blanks: If we insert [fontsize=-2] printf(\" between lines 3 and 4, it will print out the address of the assembly code compiled from line . add one to four lines of C code between lines 7 and 8 that will print out in hex the address of instruction following call g compiled from line 12. Solutions: 1. stack frames 2. 3 3. any of the AND instructions, e.g. in line 4, p.168 4. for example, [fontsize=-2] pushl w call strncpy 5. 2, 13 ","course":"ECS50"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (20) The statement [fontsize=-2] __asm__(\"addl ","course":"ECS50"} {"quiz":" Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (30) Following is a table of analogies between the C/Intel/Linux world and the JVM world. Fill in the blanks! (The lengths of the blanks are not meaningful.) rr C/Intel/Linux & JVM & Integer.parseInt() & invokestatic, invokevirtual pushl 1 & .o or ELF & EBP & ESP & 2. (20) Consider the function gy() on p.235. Fill in the blanks: The reason that x is in slot 1 instead of slot is that the Java keyword is not present in line 1. 3. (50) Consider this Java code to find Fibonacci numbers. 
(These are 1,1,2,3,5,8,13,21,..., each one being the sum of the previous two.) [fontsize=-2] public class Fib public static void main(String[] clargs) int i,n,fib[]; n = Integer.parseInt(clargs[0]); fib = new int[n]; fib[0] = fib[1] = 1; genfibs(n,fib); for (i = 0; i < n; i++) System.out.println(fib[i]); public static int genfibs(int k, int fbs[]) int i; for (i = 2; i < k; i++) fbs[i] = fbs[i-1] + fbs[i-2]; return 0; Below is part of the output from running this through javap -c. Fill in the blanks. [fontsize=-2] public static void main(java.lang.String[]); Code: 0: aload_0 1: iconst_0 2: aaload 3: invokestatic #2; //Method java/lang/Integer.parseInt... 6: istore_2 7: iload_2 8: newarray int 10: // BLANK 11: aload_3 12: iconst_0 13: aload_3 14: iconst_1 15: iconst_1 16: dup_x2 17: iastore 18: iastore 19: // BLANK 20: aload_3 21: invokestatic #3; //Method genfibs:(I[I)I 24: pop 25: iconst_0 26: istore_1 27: iload_1 28: iload_2 29: if_icmpge 47 32: getstatic #4; //Field java/lang/System.out:Ljava/io/PrintStream; 35: aload_3 36: iload_1 37: iaload 38: invokevirtual #5; //Method java/io/PrintStream.println:(I)V 41: iinc 1, 1 44: goto 27 47: return public static int genfibs(int, int[]); Code: 0: iconst_2 1: istore_2 2: // BLANK 3: iload_0 4: if_icmpge // BLANK 7: aload_1 8: iload_2 9: aload_1 10: iload_2 11: iconst_1 12: isub 13: iaload 14: aload_1 15: iload_2 16: iconst_2 17: isub 18: iaload 19: iadd 20: iastore 21: iinc 2, 1 24: goto // BLANK 27: iconst_0 28: ireturn Solutions: 1. rr C/Intel/Linux & JVM atoi() & Integer.parseInt() CALL & invokestatic, invokevirtual pushl 1 & iconst0 .o or ELF & .java ESP & optop EBP & Frame Data 2. 0, static 3. 
Key points: fib is in slot 3, so result of newarray must go there for the call to genfibs(), we must push n and then (the address of) fibs onto the stack, and n is in slot 2 to execute ifcmpge we must first push its two operands, i and k the compare operation is there to test whether we've finished the loop, and if so, we jump down to the return 0, in offsets 27 and 28 in genfibs() similarly, offset 24 in genfibs() is the bottom of the loop; if we're not done, we jump to the top, offset 2 [fontsize=-2] Compiled from \"Fib.java\" public class Fib extends java.lang.Object public Fib(); Code: 0:\taload_0 1:\tinvokespecial\t#1; //Method java/lang/Object.\"\":()V 4:\treturn public static void main(java.lang.String[]); Code: 0:\taload_0 1:\ticonst_0 2:\taaload 3:\tinvokestatic\t#2; //Method java/lang/Integer.parseInt:(Ljava/lang/String;)I 6:\tistore_2 7:\tiload_2 8:\tnewarray int 10:\tastore_3 11:\taload_3 12:\ticonst_0 13:\taload_3 14:\ticonst_1 15:\ticonst_1 16:\tdup_x2 17:\tiastore 18:\tiastore 19:\tiload_2 20:\taload_3 21:\tinvokestatic\t#3; //Method genfibs:(I[I)I 24:\tpop 25:\ticonst_0 26:\tistore_1 27:\tiload_1 28:\tiload_2 29:\tif_icmpge\t47 32:\tgetstatic\t#4; //Field java/lang/System.out:Ljava/io/PrintStream; 35:\taload_3 36:\tiload_1 37:\tiaload 38:\tinvokevirtual\t#5; //Method java/io/PrintStream.println:(I)V 41:\tiinc\t1, 1 44:\tgoto\t27 47:\treturn public static int genfibs(int, int[]); Code: 0:\ticonst_2 1:\tistore_2 2:\tiload_2 3:\tiload_0 4:\tif_icmpge\t27 7:\taload_1 8:\tiload_2 9:\taload_1 10:\tiload_2 11:\ticonst_1 12:\tisub 13:\tiaload 14:\taload_1 15:\tiload_2 16:\ticonst_2 17:\tisub 18:\tiaload 19:\tiadd 20:\tiastore 21:\tiinc\t2, 1 24:\tgoto\t2 27:\ticonst_0 28:\tireturn ","course":"ECS50"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. 
(20) Fill in the blanks: Consider two instructions, which we'll call i1 and i2, with i2 immediately following i1, and with i1 not being a jump of any kind. Then just before i1 is finished executing, the will contain the address of . Just after i1 finishes, that address will be copied to the bus. Assume no caches or instruction queues. 2. Consider the following code fragment: [fontsize=-2] ... jnz aplace movl 0x7fffffff, shll 2, addl ______________ ohhhhnoooo ... [(a)] (15) Suppose, both here and in subsequent parts, that the offset of the first movl, listed in the output of as -a, turns out to be 0028. At what offset will the second movl begin? [(b)] (15) In the output of as -a in assembling this code, what will be the machine language code generated for that second movl? [(c)] (20) What will be the machine language code generated for jnz aplace? [(d)] (15) In the instruction following the second addl, we'd like to jump to ohhhhnoooo if the last instruction produced a situation in which the sum of two positive numbers came out ``negative.'' List all possible instructions that we could put in the blank. [(e)] (15) Suppose in running this code under GDB, we issue the commands [fontsize=-2] (gdb) b aplace (gdb) run (gdb) p/x Say the output of the last command is 0x80400000. Give a numerical expression (hex numbers are OK), for the memory address of the beginning of the .text segment. Solutions: 1. PC; i2, address 2a. The instruction will assemble to 5 bytes, so the next offset will be 0x28 + 5 = 0x2d. 2b. bbffffff7f 2c. 7505 2d. js or jo 2e. 0x80400000 - (28+5) ","course":"ECS50"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. Consider the C and assembly code in Section 6.8.1, pp.134-136. [(a)] (20) Based on the information here, how many arguments does exit() have? (i) None. 
(ii) One. (iii) Two. (iv) Three. (v) It would have the number of arguments given in c(EAX). [(b)] (25) Suppose that in running this code under GDB, I set a breakpoint at line 35, p.136, and ran the program. When it stopped at the breakpoint, I submitted these commands, with the results shown: [fontsize=-2] 35 pop (gdb) info registers eax 0xbfaaaad4 -1079334188 ecx 0xbfaaaa50 -1079334320 edx 0x1 1 ebx 0x804a020 134520864 esp 0xbfaaaa18 0xbfaaaa18 ebp 0xbfaaaa38 0xbfaaaa38 esi 0x8048460 134513760 edi 0x8048340 134513472 eip 0x8048443 0x8048443 eflags 0x200202 [ IF ID ] cs 0x73 115 ss 0x7b 123 ds 0x7b 123 es 0x7b 123 fs 0x0 0 gs 0x33 51 (gdb) x/5x 0xbfaaaa18 0xbfaaaa18: 0xb8074ff4 0x0804841b 0x0804a020 0x08049ff4 0xbfaaaa28: 0xbfaaaa48 State the addresses of x and the instruction on line 26, p.135. [(c)] (10) It would have been better to use ECX instead of EBX on p.136. Very briefly explain what advantage would accrue from using ECX here, citing a specific passage in the textbook. 2. (25) Suppose we wish to store data on a stack that grows away from 0. Thus we cannot use the pushl instruction, and of course subroutine calls will use the ``normal'' stack, not this one; we will just use this one to store data. Give two lines of assembly code that will push the number 88 onto this new stack. Assume that EDX will serve as our stack pointer. 3. (20) Suppose we're writing an assembly language program whose .data segment is [fontsize=-2] .data x: .long 0 fmt: .string \" We wish to read in the value of x from the keyboard, by calling the C library function scanf(). Give assembly code (no more than five lines) that will do this. Solutions: 1a. (ii) 1b. x is at 0x804a020, and the instruction is at 0x0804841b. 1c. Page 141, bottom says we are guaranteed there will be no ``live'' values in ECX. Thus we need not save it on the stack, and later pop it off, thus saving time. 2. [fontsize=-2] addl 88,( 3. 
[fontsize=-2] pushl fmt call scanf ","course":"ECS50"} {"quiz":" Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (20) List the various arrays from our book that are created by the OS. In each case, state whether the array is accessed by the hardware. rr array name in book & hardware access? & & & & & & & 2. Running the Linux pstree command displays ``parents'' and ``children'' of processes. Suppose we run the command on CSIF, and notice that there is a gcc process running. [(a)] (15) The likely parent of that process is either or or or ... (Fill in as many command names as appropriate; answer ``none'' if there is likely no parent.) [(b)] (15) The likely child of that process is either or or or ... (Fill in as many command names as appropriate; answer ``none'' if there is likely no child.) 3 (20) Consider the 11-line excerpt of Linux internal code on p.186, at the instant just before line 12 is executed. Suppose at that time, c(ESP) = 0x8000. What will be in memory location 0x8000 at that time? 4. Consider the threads example that begins on p.190. [(a)] (15) For each of the following variables in the code, write either SAAT (``same address across threads'') or DAAT (``different address across threads''). rr kb & nthreads & id[0] & [(b)] (15) Suppose we run the program with the command [fontsize=-2] and we then type ps axH in another window. In the output of this latter command, we will likely see entries for hw, of which are in Run state, and of which are in Sleep state. Solutions: 1. rr array & hardware access? memory-allocation table & no process table & no page table & yes interrupt vector table & yes 2a. tcsh, make 2b. cpp, cc1, as, ld 3. PC value of the process that we are about to resume (called v in the text) 4a. rr kb & DAAT nthreads & SAAT id[0] & SAAT 4b. 
4, 1, 3 ","course":"ECS50"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (15) A scheme under which an I/O device communicates with memory without the intervention of the CPU is called . 2. (15) Consider the example in Section 6.11. Fill in the blanks: By using a macro instead of a subroutine, we made the program by the amount of bytes, and we made its run time . (The first blank must be filled in by larger or smaller, the second by a number, and the third by faster or slower.) 3. Answer either HW (``hardware'') or SW (``software'') concerning what entity performs each of the following tasks: [(a)] (10) Saving c(EBX) within a device driver. [(b)] (10) Setting c(IDT). [(c)] (10) Setting c(c(IDT)+8). [(d)] (15) Saving the ``bread crumbs'' when an interrupt is detected. [(e)] (10) Acknowledging an interrupt. [(f)] (15) Evicting a cache block. Solutions: 1. DMA 2. larger; 6; faster 3. SW; SW; SW; HW; HW; HW ","course":"ECS50"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (20) Fill in the blank: The set of parallel wires connecting the CPU to memory is called a . 2. Say we have a 5-bit word size. In each part below, answer with two bit strings, separated by a space, such as 01010 11110 [(a)] (10) Find the signed-magnitude and 2s complement representations (in that order) of 5. [(b)] (10) Find the signed-magnitude and 2s complement representations (in that order) of -5. 3. (20) What is printed out? int x[3][4] 5,3,8,2,1,8,3,7,88,168,8,8888; printf(\" 4. (20) Suppose we were to build a base-4 machine, i.e. 4 voltage levels coding the numbers 0,1,2,3, with word size of 3 base-4 digits. 
The decimal number 51, for instance, would be stored as 303, since 303 means . The machine will use ``4s complement'' storage for signed integers, the base-4 analog of 2s complement. Show the coding for -5. 5. (20) Say we have a disk with a rotation speed of 9600 revolutions per minute, 1000 tracks, a seek speed of seconds per track, and a sector size of 528 bytes. Give the time in seconds for a read request to start, measured from the start of the seek to the time the first byte is read. Give your answer as an R expression. Assume that at the time the seek begins: the read/write head is at the innermost track; the desired sector is in the middle track; and the start of the sector is a half revolution from the read/write head when the seek completes. Solutions: 1. bus 2a. 00101 00101 2b. 10101 11011 3. 168 4. To find -5, ``wind backwards'' from 000 five times, yielding 333, 332, 331, 330, 323. So 323 is the representation of -5. 5. seek time: 0.5 * 1000 * 0.000001 rotation time before read first byte: 0.5 * 1/(9600/60) total: 0.5 * 1000 * 0.000001 + 0.5 * 1/(9600/60) ","course":"ECS50"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (20) Fill in the blank: A is located either in or near the CPU, and contains a copy of part of memory. 2. (20) What will be printed out? char z = 'L'; printf(\" 3. (20) Suppose we are on a 32-bit Intel machine. Say z is declared as int, z = 200 and z contains . State the contents (as a base-10 number) of Byte 202. 4. The function below finds the sum of the elements in column colnum of a two-dimensional array x having nrow rows and ncol columns. 
int colsum(int *x, int colnum, int nrow, int ncol) int sum = 0, m, *p; p = ______________________; // (a) for (m = 0; m < nrow; m++) sum += *p; p += _______________________; // (b) return sum; State what code should go in the blanks: [(a)] (10) Blank (a). [(b)] (10) Blank (b). 5. (20) Consider a machine with 24-bit words and addresses. We will be storing numbers in the range 0,1,...,15; I'll call such numbers glonks. We want to store as many glonks as possible, so of course we will store multiple glonks per word. How many can we store in all of memory? Your answer must consist of an R expression. Assume that our machine has as much memory as possible, and that the operating system etc. take up only negligible space. Solutions: 1. cache 2. ']' 3. c(z) = 0x123456, and we are on a little-endian machine, so 0x56 is in Byte 200, 0x34 in Byte 201 and 0x12 = 18 is in Byte 202. 4.a x + colnum 4.b ncol 5. 2 * 2^24 ","course":"ECS50"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (20) Consider the program on pp.64-65. Suppose we wish to sum only the even-numbered indexed elements of the array, treating the word in line 6 as element 0, the one in line 7 as element 1 and so on. We'll need to change two lines in the code. State the line numbers and the changes. Your answer MUST have the form, change line 28 to \"incl line 168 to \"addl 2, 24: addl ","course":"ECS50"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (20) Fill in the blanks: The analog of a compiler at the assembly language level is called a . In our class, we refer to it metaphorically as a human . 2. 
As you know, when writing functions/subroutines, we may originally write one for use in one program but then find we can use it---unchanged---in another program. Suppose that is the case for the subroutine findmin in Section 3.6, pp.72ff; in other words, we simply copy lines 54-95 to the source file of another program, and use the subroutine there. Say the array we'll be using it on begins at a label z and has 12 elements. The relevant section of code will be movl ________________________ # blank (a) ________________________ # blank (b) call findmin movl z, 2.b movl ","course":"ECS50"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. In all Test problems, assume 32-bit words unless stated otherwise. 1. (20) Fill in the blanks: During the execution of the jump instruction in line 24, p.94, the jump will be done if the Flag and the Flag bits satisfy certain conditions. 2. Consider the following code, which multiplies each element in an array of 4 words by 28. It is assumed that x is a label in the .data segment (not shown), that all elements of x are considered unsigned, and that no product will need more than 32 bits. .text .globl _start _start: blank (a) (entire line) movl 4, top: movl x( imul blank (b) movl decl jz done blank (c) (entire line) blank (d) (entire line) done: movl 0, movl 4, top: movl x( imul movl decl jz done addl 0, 2.e Even though we know the product is unsigned and will fit in EAX, IMUL won't know that, and will sign-extend in EDX. That will wipe out our value of expressions like x( first element of x will be changed, and the final values in that array will be 140, 5, 1, 8. 3. 
16, 30 CODE TO MULTIPLY X BY 28: PART (E) OF THE MUL28 PROBLEM: 56, 5, 1, 8 (EDX gets wiped out by IMUL) CODE TO MULTIPLY 8 BY 9: .text .globl _start _start: movl x, movl shll 13, ","course":"ECS50"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. On all Tests, 32-bit word size on Intel machines running Linux is assumed unless otherwise stated. 1. (20) A C return statement is translated to a certain machine/assembly language instruction. What is its name? 2. (25) Suppose we are debugging the code on pp.64-65. Then names such as sum and top will be available to us from our debugging tool if we had used the option at the time we assembled the program. 3. (25) Consider the function int f(n) int k; k = n * f(n-1); return k; Suppose at runtime the operating system has allocated 600 words for our stack, and that we do not have write permission for the first word below (i.e. at a smaller address) the stack space. Say the stack is empty, and we make the call f(100). Then we will get a seg fault on the (or or or ) call to f(); fill in the blank, using an R expression as your answer. 4. (30) Suppose several local variables in a C source file are declared this way: int x = 5; static y, z = 12; // equiv. to static int y,z=12; Then probably: [(a)] (10) The variable x will be stored in . [(b)] (10) The variable y will be stored in . [(c)] (10) The variable z will be stored in . Solutions: 1. ret 2. --gstabs 3. Each call expands the stack by 4 words (1 for argument, 1 for local, 1 for bread crumbs, 1 for saved EBP), so 150 calls will fill the stack, and the 151st will cause a seg fault. 
4.a in the stack 4.b in a .comm segment 4.c in a .data segment C return translated to ret return names like sum, top: --gstabs recursion: each call expands the stack by 4 words (1 for argument, 1 for local, 1 for bread crumbs, 1 for saved EBP), so 150 calls will fill the stack, and the 151st will cause a seg fault storage of locals: x in the stack, y in the .comm section, z in the .data section ","course":"ECS50"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. On all Tests, 32-bit word size on Intel machines running Linux is assumed unless otherwise stated. 1. (15) The name for the Intel stack pointer register is . 2. (20) Suppose we wish to call scanf() from x.s. Then instead of running as, it's more convenient to run because the latter will automatically link in the C . 3. (20) Suppose there is a certain C language function, f(), with type int, i.e. it has an integer return value. Suppose the compiler produces code in which the return value is held in ECX during intermediate computation. Fill in the blank in the code below, which shows what the compiler produces near the end of the function. movl __________, __________ ret 4. I ran the code in pp.137-139 with GDB. Here is part of my session: Breakpoint 1, addone () at AddOne.s:25 25 push (gdb) n 30 movl 8( (gdb) n 32 incl ( (gdb) info registers eax 0xbffff7e4 -1073743900 ecx 0xcbe07bea -874480662 edx 0x1 1 ebx 0x804a020 134520864 esp 0xbffff718 0xbffff718 ebp 0xbffff738 0xbffff738 esi 0x0 0 edi 0x0 0 eip 0x8048439 0x8048439 eflags 0x200282 [ SF IF ID ] cs 0x73 115 ss 0x7b 123 ds 0x7b 123 es 0x7b 123 fs 0x0 0 gs 0x33 51 [(a)] (15) At what address is the return address ( the ``bread crumbs'') stored? [(b)] (15) At what address is x (in the main program) stored? 
[(c)] (15) During the execution (not fetch) of the incl instruction, what value(s) will be placed on the data bus? Solutions: name of stack ptr: ESP calling scanf(): gcc, library ECX: movl GDB session: 0xbffff71c, 0x804a020, 7, 8 ","course":"ECS50"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. In all Test problems, assume 32-bit words unless stated otherwise. 1. (20) Consider the code .data .rept 2 .string \"abc\" .endr At what offset will the second 'c' be stored? 2. Consider the following code, which multiplies a word labeled x in the .data segment by 9. .text .globl _start _start: movl blank (a), movl blank (b), shll blank (c), addl blank (d) movl [(a)] (10) State what goes in blank (a). [(b)] (10) State what goes in blank (b). [(c)] (15) State what goes in blank (c). [(d)] (15) State what goes in blank (d). [(e)] (10) State what goes in blank (e). 3. (20) Give a single assembly instruction that places 1s in bit positions 0, 1 and 4 of EAX. CODE TO MULTIPLY X BY 28: .text .globl _start _start: movl 0, movl 4, top: movl x( imul movl decl jz done addl 0, PART (E) OF THE MUL28 PROBLEM: 56, 5, 1, 8 (EDX gets wiped out by IMUL) CODE TO MULTIPLY 8 BY 9: .text .globl _start _start: movl x, movl shll 13, ","course":"ECS50"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. On all Tests, 32-bit word size on Intel machines running Linux is assumed unless otherwise stated. 1. (20) The reason that the CPU, memory and I/O devices are able to communicate with each other is that they are all connected to the . 2. (20) Threads communicate with each other via variables. 3. (15) Consider the ISR in pp.172-173. 
Suppose we wish to give this device the highest priority, in the sense of not allowing the ISR to be interrupted. Then we would insert a(n) instruction after line . 4. Here is a portion of the file PrimeThreads.s, the assembly language code produced by running gcc -S on the primes counter, pp.201-203: movl ","course":"ECS50"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. On all Tests, 32-bit word size on Intel machines running Linux is assumed unless otherwise stated. 1. (25) When an interrupt occurs, there will be a slight delay before the currently-running program is suspended, because the circuitry does not check for interrupts until after Step . 2. (25) During bootup, the OS places the addresses of the device drivers into an entity known as the , and points to that entity. 3. (25) Consider the keyboard device driver, pp.168-169. Suppose we wish to determine whether this is a key press or a key release. Show the instruction we'd put at done to start determining this. (NOT a MOV instruction.) 4. (25) Consider the primes counter, pp.201-203. Most of the values of work printed out will be 0 if is . (Place a program variable in the first blank, and either ``large'' or ``small'' in the second. Just give answers that work, not all possible answers.) Solutions: 1. C 2. interrupt vector table; IDT 3. cmpb ","course":"ECS50"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. On all Tests, 32-bit word size on Intel machines running Linux is assumed unless otherwise stated. 1. 
(15) The execution (not fetch) of an IRET instruction makes (fill the first blank with a number, the second with more or fewer, and the third with read or write) accesses to memory than does an RET. Assume no cache. 2. (15) Page replacement policy is set by (i) hardware; (ii) system software; (iii) the application programmer; (iv) a combination of (i) and (ii); (v) a combination of (i) and (iii); (vi) a combination of (ii) and (iii); (vii) a combination of (i), (ii) and (iii). 3. Consider the code int main() int m; scanf(\" m++; printf(\" Running this through gcc -S yields: I've removed some extraneous material. [numbers=left] .LC0: .string \" .LC1: .string \" .text main: pushl blank(a) andl 32, blank(b) 28( movl blank(c) call scanf movl 28( addl .LC1, ( call printf blank(e) ret [(a)] (15) Fill blank (a). [(b)] (10) Fill blank (b). [(c)] (10) Fill blank (c). [(d)] (10) Fill blank (d). [(e)] (10) Fill blank (e). [(f)] (15) Give the line numbers in the assembly code which could cause a page fault when the instruction is executed (not fetched). Do NOT include any lines containing a blank, and do NOT include the even-numbered lines. Solutions: 1. 2 more reads 2. (ii) 3.a-e .LC0: .string \" .LC1: .string \" .text main: pushl movl andl 32, leal 28( movl movl 1, movl movl 28( movl movl ","course":"ECS50"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. On all Tests, 32-bit word size on Intel machines running Linux is assumed unless otherwise stated. 1. (15) Say we have a multicore machine. Among the registers EAX, EBP, EFLAGS, ESP, ESI, IDT and PTR, state which must have separate versions in each core. For example, if you state EAX has such a property, that means that each core must have its own separate EAX register, rather than one EAX register serving all cores. 
Hint: There will be at least one register in the list with this property. 2. (20) How many times will the hardware consult the page table (assume no TLB or other cache) during the execution (not decode) of each of the following instructions: [(a)] () incl [(b)] () addl ( [(c)] () movl x, [(d)] () movl 16, \tmovl\t 0, -4( \tjmp\t.L2 .L3: \tmovl\t12( \timull\tblank(a)( \taddl\t20( \tsall\t 16, \tmovl\t 0, -4( \tjmp\t.L2 .L3: \tmovl\t12( \timull\t-4( \taddl\t20( \tsall\t 1, -4( .L2: \tmovl\t-4( \tcmpl\t12( \tjl\t.L3 \tmovl\t-8( \tleave \tret ","course":"ECS50"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. In this problem you will enhance the textfile class on p.22. First, you will add a member variable tfiles, a list of pointers to all the files for which textfile instances currently exist. Second, you will add method named cat(), which has just a single argument, whose name is outflname. This function will concatenate all the files in tfiles, outputting the result to a new file whose name is given by outflname. Use the open-for-writing form of open(), which just involves adding 'w' as a second argument, and writelines(), which works as the opposite of readlines() except that now there is an argument, the outfile name. You should also use the close() method for files. You can read examples on p.52 if you wish, but it's not necessary, as all the information is above. If for example file a consists of abc de f and file b consists of 8 168 then the concatenated file contents are abc de f 8 168 PLEASE WRITE YOUR SOLUTION AS FOLLOWS: Simply write the new lines that must be added; don't copy down the entire existing textfile class code. So, write something like, ``In between lines 5 and 6, insert the following code...'' 2. 
Consider the unit square S in the plane, with lower-left corner at (0,0) and upper-right corner at (1,1). We are interested in distances from points in this square to (1,0). There is also a smaller rectangle R, of width 2w and height h, with lower-left point (0.5-w,0) and upper-right point (0.5+w,h) (sides parallel to the outer square). We are interested in the minimum travel distance to (1,0) for each point in S that is not in R, under the constraint that travel is not allowed within R. Note (see the function d() below) that we are using ``Manhattan street distance,'' which means paths consist only of vertical and horizontal segments. Say for instance w = 0.25 and h = 0.50, and we are considering the point (0.20,0.10). The shortest path to (1,0) consists first of going to (0.25,0.50), then along the top of R, and then to (1,0), for a total distance of 0.05 + 0.40 + 0.50 + 0.50 + 0.25. We set up an nxn grid of points within S [(0,0) through , and for each one wish to compute the length of the shortest path to (1,0). For points in R, we define this distance to be -1.0. The function getdists(w,h,n) below returns the distances in a list of lists (i.e. two-dimensional ``array''). Fill in the details. import math def d(x,y,x1,y1): return abs(x1-x) + abs(y1-y) # returns the minimum distance # from (x,y) to (1,0) (or returns -1.0) def calcdistto10(x,y,w,h): # insert 1 or more lines here # ... def getdists(w,h,n): # insert 1 or more lines here # ... return dists IMPORTANT NOTE: Don't worry whether boundary lines of R count as part of R or not. Solutions: 1. 
[numbers=left] class textfile: ntfiles = 0 # count of number of textfile objects fls = [] def __init__(self,fname): textfile.ntfiles += 1 textfile.fls.append(self) self.name = fname # name self.fh = open(fname) # handle for the file self.lines = self.fh.readlines() self.nlines = len(self.lines) # number of lines self.nwords = 0 # number of words self.wordcount() def wordcount(self): \"finds the number of words in the file\" self.nwords = reduce(lambda x,y: x+y,map(lambda line: len(line.split()),self.lines)) def grep(self,target): \"prints out all lines containing target\" lines = filter(lambda line: line.find(target) >= 0,self.lines) print lines def cat(outflname): ofl = open(outflname,'w') lns = [] for fl in textfile.fls: lns += fl.lines ofl.writelines(lns) ofl.close() cat = staticmethod(cat) 2. def d(x,y,x1,y1): return abs(x1-x) + abs(y1-y) def calcdistto10(x,y,w,h): if x > 0.5 - w and x < 0.5 + w and y < h: return -1.0 if x < 0.5 - w and y < h: return d(x,y,0.5-w,h) + 2*w + h + (0.5-w) return d(x,y,1,0) def getdists(w,h,n): dists = [] for i in range(n): rowofdists = [] for j in range(n): tmp = calcdistto10(float(i)/n,float(j)/n,w,h) rowofdists.append(tmp) dists.append(rowofdists) return dists ","course":"ECS145"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. The function wrmat() inputs a matrix and a file name, and outputs the matrix to a text file. For example, the call wrmat([ [1,2,3], [5,12,13] ],'outmat.txt') would produce the text file outmat.txt with contents 1 2 3 5 12 13 Fill in the blanks. def wrmat(mat,tf): f = open(tf,'w') for row in mat: # insert 1 or more lines f.write(outrow+' f.close 2. The function primefact() below finds the prime factorization of number, relative to the given primes. 
For example, the call primefact([2,3,5,7,11],24) would return [ [2,3], [3,1] ], meaning that 24 = 2^3 * 3^1. (It is assumed that the prime factorization of n does indeed exist for the numbers in primes.) Fill in the blanks. # find the maximal power of p # that evenly divides m def dividetomax(p,m): k = 0 while True: if m k += 1 m /= p def primefact(primes,n): tmp = map( # blank tmp = filter( # blank return tmp Solutions: 1. def wrmat(mat,tf): f = open(tf,'w') for row in mat: outrow = '' for elt in row: outrow += str(elt) + ' ' f.write(outrow+' f.close() 2. # find the maximal power of p # that evenly divides m def dividetomax(p,m): k = 0 while True: if m k += 1 m /= p def primefact(primes,n): tmp = map(dividetomax,primes,len(primes)*[n]) tmp = filter(lambda u: u[1] > 0,tmp) return tmp ","course":"ECS145"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. () The function findfile() searches for a file (which could be a directory) in the specified directory tree, returning the full path name of the first instance of the file found with the specified name, or returning None if not found. For instance, suppose we have the directory tree /a shown on pp.51-52, except that /b contains a file z. Then the code print findfile('/a','y') print findfile('/a','b') print findfile('/a','u') print findfile('/a','z') print findfile('/a/b','z') produces the output /a/y /a/b None /a/b/z /a/b/z Fill in the blanks: import # blank def findfile(treeroot,flname): os.chdir(treeroot) currfls = os.listdir('.') for fl in currfls: if fl == flname: # blank for fl in currfls: # blank; insert <= 5 lines of code, # possibly including with lesser indentation Solutions: 1. 
import os # returns full path name of flname in the tree rooted at treeroot; # returns None if not found; directories do count as finding the file def findfile(treeroot,flname): os.chdir(treeroot) currfls = os.listdir('.') for fl in currfls: if fl == flname: return os.path.abspath(fl) for fl in currfls: if os.path.isdir(fl): tmp = findfile(fl,flname) if not tmp == None: return tmp return None def main(): print findfile('/a','y') print findfile('/a','u') print findfile('/a','z') print findfile('/a/b','z') if __name__ == '__main__': main() ","course":"ECS145"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (10) Below is a generator version of the circular queue example on p.85, plus a test program. Fill in the blanks: def cq(q): while True: head = q[0] # one blank line # one blank line def main(): x = [5,12,13] g = cq(x) print g.next() # prints 5 print g.next() # prints 12 print g.next() # prints 13 print g.next() # prints 5 print g.next() # prints 12 2. (10) Below is a function to find all subsets of size k from a set of size n. Here's a test: def subsets(n,k): # remaining code... def main(): n = int(sys.argv[1]) k = int(sys.argv[2]) g = subsets(n,k) for sub in g: print sub [0, 1] [0, 2] [0, 3] [0, 4] [1, 2] [1, 3] [1, 4] [2, 3] [2, 4] [3, 4] Fill in the blanks: def subsets(n,k): if k == 0: yield # blank # blank for i in range(n-k+1): # find all subsets beginning with i g = # blank for sub in g: yield #blank Solutions: 1. [numbers=left] def cq(q): while True: head = q[0] yield head q = q[1:] +[head] def main(): x = [5,12,13] g = cq(x) print g.next() print g.next() print g.next() print g.next() print g.next() if __name__ == '__main__': main() 2. 
[numbers=left] import sys def subsets(n,k): if k == 0: yield [] raise StopIteration for i in range(n-k+1): # find all the subsets beginning with i g = subsets(n-i-1,k-1) for sub in g: yield [i] + map(lambda u:u+i+1,sub) def main(): n = int(sys.argv[1]) k = int(sys.argv[2]) g = subsets(n,k) for sub in g: print sub if __name__ == '__main__': main() ","course":"ECS145"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (25) Consider the grading program example in Section 1.9. Suppose an instructor perversely sets a rule that the highest few quiz grades would be dropped. State a single line to be changed in the existing code, and state what the new line would be. For example, if you think that Line 32 should be changed to x = 88, write your answer as 32 x = 88 2. Consider each part of this problem as a separate, independent but complete Python interactive session (minus output and error messages, if any). For each part, answer ``error'' or ``no error'', respectively, depending on whether an error message occurs from execution of the code. [(a)] (5) >>> x = [[1,2,3],[5,12,13],[6,7]] [(b)] (5) >>> sqrt(9) [(c)] (5) >>> def sq(x): ... return x*x ... >>> y = sq >>> y = y(3) [(d)] (5) >>> class x: ... def __init__(self): ... self.y = 8 ... >>> a = x() >>> a.z = 88 [(e)] (5) >>> x = (1,2,3) >>> x.append(4) 3. (25) The function places() below returns a Python list of the instances of the character c in the string s. For example, >>> y 'abcdefaba' >>> places(y,'a') [0, 6, 8] Fill in the blanks. def places(s,c): base = _____________________ insts = ___________________ while True: place = s[base:]_______________ if place == -1: break insts.append(_________________) base = __________________ return insts 4. 
(25) The built-in Python function map(f,sq) (in its basic form) calls the function f() on each element of the sequence sq, returning a list consisting of the results. The function extractcol(j,m) returns column j of the matrix m, where we are defining a matrix to be a two-dimensional array with the same number of elements in each row. For instance, >>> m [[2, 3], [7, 8], [0, 28]] >>> extractcol(1,m) [3, 8, 28] Fill in the blanks. def extractcol(j,m): def getelt(r): return __________________ return _____________________ Solutions: 1. 49 tmp = tmp[:(len(tmp) - ndrop)] 2. Parts (b) and (e) produce error messages. 3. def places(s,c): base = 0 insts = [] while True: place = s[base:].find(c) if place == -1: break insts.append(base + place) base = base + place + 1 return insts 4. def extractcol(j,m): def getelt(r): return r[j] return map(getelt,m) ","course":"ECS145"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (10) Below is a generator version of the circular queue example on p.85, plus a test program. Fill in the blanks: def cq(q): while True: head = q[0] # one blank line # one blank line def main(): x = [5,12,13] g = cq(x) print g.next() # prints 5 print g.next() # prints 12 print g.next() # prints 13 print g.next() # prints 5 print g.next() # prints 12 2. (10) Below is a function to find all subsets of size k from a set of size n. Here's a test: def subsets(n,k): # remaining code... def main(): n = int(sys.argv[1]) k = int(sys.argv[2]) g = subsets(n,k) for sub in g: print sub [0, 1] [0, 2] [0, 3] [0, 4] [1, 2] [1, 3] [1, 4] [2, 3] [2, 4] [3, 4] Fill in the blanks: def subsets(n,k): if k == 0: yield # blank # blank for i in range(n-k+1): # find all subsets beginning with i g = # blank for sub in g: yield #blank Solutions: 1. 
[numbers=left] def cq(q): while True: head = q[0] yield head q = q[1:] +[head] def main(): x = [5,12,13] g = cq(x) print g.next() print g.next() print g.next() print g.next() print g.next() if __name__ == '__main__': main() 2. [numbers=left] import sys def subsets(n,k): if k == 0: yield [] raise StopIteration for i in range(n-k+1): # find all the subsets beginning with i g = subsets(n-i-1,k-1) for sub in g: yield [i] + map(lambda u:u+i+1,sub) def main(): n = int(sys.argv[1]) k = int(sys.argv[2]) g = subsets(n,k) for sub in g: print sub if __name__ == '__main__': main() ","course":"ECS145"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (20) On p.105, it says, ``Since the simulated time variable Simulation.t is in a separate module...'' Suppose we've installed SimPy in /a/b/c/. What will be the full path name of the module referred to here? 2. (20) What kind of object is required for the second argument to SimPy's activate() function? Your answer must be a single Python term. 3. This problem concerns the first example in our Oct. 24 handout titled 3des.pdf. [(a)] (20) The difference computed in line 39 is equal to the value of an expression computed in another line. State the number of the latter. [(b)] (20) Say we're interested in the mean idle period per machine. (Note that due to the symmetry of the situation, this will be the same for all machines.) Add code to compute this and print it out. Write your answer in the form, ``Between lines 40 and 41, insert this line...'' 4. (20) Consider the cell phone model in the handout in Problem 3. Suppose that 5 of all the calls made by those in cars passing through the cell ends before the car gets all the way through the cell. 
Among such calls, let Y denote the proportion of the trip through the cell during which the call is still active. For instance, Y = 0.88 means the call ended when the car was 88% of the way through the cell. Assume Y is uniformly distributed on (0,1), which by the way can be simulated by calling uniform(0,1) in Python's random library. Show how to alter the code to reflect this variation on the original model. As in Problem 3, express your answer in terms of what code you insert where, and in this case, also state the lines to be deleted, if any. Solutions: 1. /a/b/c/SimPy/Simulation.py 2. iterator 3.a 36 3.b between 21 and 22: TotIdle = 0.0 NIdle = 0 between 27 and 28: StartIdle = now() MachineClass.NIdle += 1 between 28 and 29: MachineClass.TotIdle += now() - StartIdle between 73 and 74: print MachineClass.TotIdle / MachineClass.NIdle 4. between 157 and 158: [numbers=left] if Globals.Rnd.uniform(0,1) < 0.1: self.Dur *= Globals.Rnd.uniform(0,1) ","course":"ECS145"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. () Here you will write an R S3 class \"stack\", to serve as a stack data structure. (Review: Items are pushed onto the stack, meaning here that they are added to the right end of a vector, and they are popped, meaning deleted from the right end.) Here is a sample session: > source(\"stack.R\") # read in code > ls() [1] \"pop\" \"push\" \"stack\" > stack(\"mystack\") # make an object > ls() # did we make the object? 
[1] \"mystack\" \"pop\" \"push\" \"stack\" > push(3,\"mystack\") # push 3 > mystack data # check vector is now 3,8 [1] 3 8 > pop(\"mystack\") # should return the 8 [1] 8 > mystack data <- vector() tmp data) val <- # blank # blank; insert 2 lines here Solutions: # the stack class will (in this implementation) have objects only at the # global level; an object has 2 member variables, the vector # representing the stack, and the name of the object; values are pushed # onto/popped off of the right end of the data vector # constructor for \"stack\" class; objname is a string, namely a variable # to be created at the top level, i.e. global stack <- function(objname) tmp <- list() tmp objname <- objname class(tmp) <- \"stack\" assign(objname, tmp, envir = .GlobalEnv) # pushes val onto the stack named objname push <- function(val,objname) tmp <- get(objname, envir = .GlobalEnv) tmpdata,val) assign(objname, tmp, envir = .GlobalEnv) # pops the stack named objname, returns the popped value pop <- function(objname) tmp <- get(objname, envir = .GlobalEnv) lng <- length(tmp data[lng] tmpdata[-lng] assign(objname, tmp, envir = .GlobalEnv) return(val) ","course":"ECS145"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. This problem concerns the program psax.py, which monitors processes, in the curses chapter of our book. [(a)] (5) State the number of a line in which an instance variable is accessed. If there is no such line, write NONE. [(b)] (10) State the number of a line in which a class variable is accessed. If there is no such line, write NONE. [(c)] (10) State the number of a line in which a class method is accessed. If there is no such line, write NONE. [(d)] (10) Suppose the file gy consists of a single line with contents 'uuddkkrrState in a SINGLE, BRIEF line what will happen. 2. (30) Consider the binary tree example, Section 1.20. 
We will add a new method max() to the class treenode. Note since it is a method rather than a freestanding function, it will not conflict with the built-in Python function max(), which works as follows: >>> max(12,5,13) 13 If z is an object of the class treenode, then z.max() will return the maximum value in the tree rooted at z. Example: >>> x = [12,5,13,10,8,6,28] >>> tr = bintree.tree() >>> for n in x: tr.insrt(n) >>> tr.root.max() 28 Fill in the blanks: def max(self): s = blank (a) if blank (b) : s = blank (c) return s 3. (25) Here we will deal with a class representing a vending machine. Each object of this class represents one machine, but all the machines carry the same items (though the current size of the stock of a given item may vary from machine to machine). The inventory variable will be a dictionary with keys being item names and values being the current stocks of those items, e.g. 'Kit Kat':8 signifying that this machine currently holds a stock of 8 Kit Kat bars. The method newstock() adds to the stocks of the given items; e.g. m.newstock({'Kit Kat':3,'Sun Chips':2}) would record that the stocks of Kit Kat bars and bags of Sun Chips at machine m have been replenished by 3 bars and 2 bags, respectively. Fill in the blanks: class machine: itemnames = [] def __init__(self): self.inventory = blank (a) for nm in blank (b) : self.inventory[nm] = 0 def newstock(self,newitems): for itm in blank (c) : blank (d) += blank (e) 4. (10) This is a continuation of Problem 3. The following test of the above code produces an error: >>> m = machine() >>> machine.itemnames = ['a','b'] >>> m.newstock({'b':3}) Traceback (most recent call last): File \"\", line 1, in File \"\", line 14, in newstock KeyError: 'b' State in a SINGLE, BRIEF line how to fix this test. Solutions: 1. Note that os and curses are modules, not classes, as can be seen by the fact that they are imported. 1a. NONE 1b. any line containing ``gb.'' 1c. NONE 1d. The psax.py program itself would be killed! 2a. 
self.value 2b. self.right != None 2c. self.right.max() 3a. {} 3b. machine.itemnames 3c. newitems.keys() 3d. self.inventory[itm] 3e. newitems[itm] 4. Swap the first two lines. def max(self): # no name conflict s = self.value if self.right != None: s = max(s,self.right.max()) return s class machine: itemnames = [] def __init__(self): # in (itemname, stock) form self.inventory = {} for nm in machine.itemnames: self.inventory[nm] = 0 # adds the new stock to inventory; items is in dictionary form, # (itemname, newstock form) def newstock(self,newitems): for itm in newitems.keys(): self.inventory[itm] += newitems[itm] # wrong m = machine() machine.itemnames = ['a','b'] m.newstock({'b':3}) # right machine.itemnames = ['a','b'] m = machine() m.newstock({'b':3}) ","course":"ECS145"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. This problem concerns the file bookvec.R in our handout. [(a)] (35) Suppose we create a new object z of type \"bookvec\", and then execute z[2] <- z[2] Name the functions in bookvec.R that are executed in these two actions (object creation, assignment). [(b)] (35) One of the comments says, ``note the recycling.'' Give a specific illustration of recycling in the line associated with that comment, involving the object b in the examples in the handout, and explain why. No ``snow jobs,'' please; be specific. Your answer must be in similar form to If we execute > x <- b[1] then the vector (6,1,9) will be recycled to (6,1,9,6,1). [(c)] (30) Write a function sum.bookvec() that will ``overload'' R's generic sum() function. It will return the sum of counts of writes, e.g. > b <- newbookvec(c(5,12,13)) > b[2] <- 8 > b[2] <- 88 > b[3] <- 168 > sum(b) [1] 3 You are allowed a maximum of 5 lines of code. 
Solutions: 1.a newbookvec() [.bookvec [<-.bookvec 1.b If we execute > b[2:4] <- 7 the one-element vector 1 will be recycled to (1,1,1). 1.c sum.bookvec <- function(bv,na.rm=F) sum(bv ","course":"ECS145"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (60) In this problem, we will develop a function rmnulldirs(). It will remove all empty directories within the specified directory tree. For instance, say rmnulldirs() is invoked on a directory whose only contents is a subdirectory u, which in turn contains further subdirectories v and w, with v empty. Within u and w, there are nondirectory files zu and zw, respectively. The result will be that v will be removed, with nothing else changed. Note: os.rmdir(drcty) removes the directory whose full path name is in drcty. Fill in the blanks. # descends the directory tree rooted at # rootdir, removing all empty directories import os def checklocalempty(dummyarg,dir,flst): for f in flst: fullf = os.path.join(dir,f) if blank (a) (fullf): subdirfls = blank (b) (fullf) if blank (c) : blank (d) def rmnulldirs(rootdir): fullrootdir = blank (e) (rootdir) blank (f) (fullrootdir,checklocalempty,None) 2. (40) In the code below, we have an iterator that fetches words one at a time from a text file. (It's actually the same as in Section 6.2.3, but as an ordinary iterator, not a generator.) Here is an example: The file x consists of abc de f g h ij klm nopq rstuv We run the code: >>> from wordfetch import * >>> g = wordfetch(open('x')) >>> for word in g: print word ... 
abc de f g h ij klm nopq rstuv Fill in the blanks: class wordfetch: def __init__(self,fl): self.fl = fl # words remaining in current line self.words = [] def __iter__(self): return self def next(self): if self.words == []: line = blank (a) # check for end-of-file if line == '': blank (b) # remove end-of-line char line = line[:-1] self.words = blank (c) firstword = self.words[0] self.words = blank (d) return firstword # descends the directory tree rooted at rootdir, removing all empty # directories import os def checklocalempty(dummyarg,dir,flst): for f in flst: fullf = os.path.join(dir,f) if os.path.isdir(fullf): subdirfls = os.listdir(fullf) if subdirfls == []: os.rmdir(fullf) def rmnulldirs(rootdir): fullrootdir = os.path.abspath(rootdir) os.path.walk(fullrootdir,checklocalempty,None) def main(): rmnulldirs('.') if __name__ == '__main__': main() class wordfetch: def __init__(self,fl): self.fl = fl # words remaining in current line self.words = [] def __iter__(self): return self def next(self): if self.words == []: line = self.fl.readline() # check for end-of-file if line == '': raise StopIteration # remove end-of-line char line = line[:-1] self.words = line.split() firstword = self.words[0] self.words = self.words[1:] return firstword def main(): f = wordfetch(open('x')) for word in f: print word if __name__ == '__main__': main() ","course":"ECS145"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (30) The table below contains analog pairs between Python and R. Fill in the blanks. rr Python & R & list & vector lambda function & blank (a) dictionary & blank (b) map & blank (c) 2. (15) Fill in the blank: > f <- function(x) x^2 > f function(x) x^2 > print(f) function(x) x^2 > p blank (f) function(x) x^2 3. (55) A graph adjacency matrix consists of 0s and 1s, with a 1 at element (i,j) meaning there is a link from i to j. 
The function haslinks(adj,target) determines which vertices in the graph have links to a given set of target vertices. Here are examples: > m [,1] [,2] [,3] [,4] [,5] [1,] 1 0 1 0 1 [2,] 1 1 0 0 1 [3,] 1 0 0 1 1 [4,] 0 1 1 1 0 [5,] 1 1 0 1 1 > haslinks(m,c(1,4)) [1] 3 5 > haslinks(m,4) [1] 3 4 5 > haslinks(m,1:2) [1] 2 5 > haslinks(m,c(1,3,5)) [1] 1 In the first call, for instance, we ask which vertices have links to both vertex 1 and vertex 4, and the function reports that vertices 3 and 5 (rows 3 and 5 in the matrix) have that property. Fill in the blanks: haslinks <- function(adj,target) canreachtarget <- function(outlinks) which1s <- which( blank (a) ) tmp <- blank (b) (target,which1s) as.integer( blank (c) (tmp,target)) tmp1 <- apply( blank (d) ) which(tmp1 == 1) Solutions: 1.a anonymous function 1.b list 1.c apply 2. rint.function 3. haslinks <- function(adj,target) canreachtarget <- function(outlinks) which1s <- which(outlinks == 1) tmp <- intersect(target,which1s) as.integer(setequal(tmp,target)) tmp1 <- apply(adj,1,canreachtarget) which(tmp1 == 1) ","course":"ECS145"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (10) Consider the example in Sec. 4.6.1, Nonblocking Sockets. This problem will concern what would happen if we had forgotten to include line 34 in the server code. Treat this code as a very simple ``game.'' Suppose that: there are two clients, run by Person A and Person B; Person A starts playing at least several seconds before Person B; the players will only type characters when invited to do so by the prompt; Person A is planning to type 'a', then 'b', then quit the game, while Person B is planning to type 'x', 'y' and 'z', then quit. Give the final value of v, output by the server. 2. 
(30) The function findtwins(x) below returns a Python list of all indices i for which x[i] = x[i+1] (excluding the last element of x, for which there is no right-hand neighbor). It is assumed that none of the (original) elements of x has the value None. Example: >>> x = [12, 5, 13, 13, 3, 4, 5, 5] >>> findtwins(x) [2, 6] >>> findtwins([1,12,5,13,13,13,8,8,12]) [3, 4, 6] Fill in the blanks: def findtwins(x): nx = len(x) x.append(None) def compxii1(i): if x[i] == x[i+1]: return blank (a) else: return -1 indcs = map(compxii1,blank (b) )) # indcs now consists of the found # indices and -1s del x[nx] return filter(blank (c) ,indcs) 3. (60) Below is SimPy code that simulates the operation of a disk drive. Recall how such a system works: Data is stored in concentric rings called tracks. Each track is divided into sectors, and each read/write operation is done on a sector. Time needed to fulfill a disk access request consists first of a seek, in which the read/write head is moved to the desired track, then a rotational delay during which the desired sector rotates around to the head, and finally a data transfer time to process the sector. In the simple model here, we assume the latter is negligible. The time to go from the innermost to outermost track is endtoendsk, and the time for a full rotation is onerot. We model the track number as a continuous variable ranging from 0 (innermost track) to 1 (outermost track), and the track number requested by a job is modeled as a random number between 0 and 1. Similar statements hold for the sector number. 
Fill in the blanks: import sys import math from SimPy.Simulation import * from random import Random,expovariate,random class gb: # globals rnd = Random(12345) ddrproc = None arrvproc = None class ddr(Process): def __init__(self,endtoendsk,onerot): Process.__init__(self) self.endtoendsk = endtoendsk self.onerot = onerot self.currtrack = 0.5 # state is used for code logic and debugging self.state = 'resting' # not processing a request self.nrequestdone = 0 self.queue = [] def Run(self): onerot = self.onerot endtoendsk = self.endtoendsk while True: if self.queue == []: self.state = 'resting' blank (a) self.state = 'seeking' job = blank (b) yield hold,self, blank (c) self.currtrack = job sector = gb.rnd.random() currangle = math.fmod(now(),onerot) / onerot tmp = sector - currangle if tmp > 0: rotdelay = tmp * self.onerot else: rotdelay = (1-currangle+sector)*self.onerot self.state = 'waiting rotation' yield hold,self,rotdelay blank (d) class arrivals(Process): def __init__(self,arrvrate): Process.__init__(self) self.arrvrate = arrvrate def Run(self): while True: yield hold,self,gb.rnd.expovariate(self.arrvrate) track = gb.rnd.random() gb.ddrproc.queue.append(track) if gb.ddrproc.state == 'resting': blank (f) def main(): initialize() ees = float(sys.argv[1]) oner = float(sys.argv[2]) d = ddr(ees,oner) gb.ddrproc = d activate(d,d.Run()) arrvrate = float(sys.argv[3]) a = arrivals(arrvrate) gb.arrvproc = a activate(a,a.Run()) maxsimtime = float(sys.argv[4]) simulate(until=maxsimtime) print 'throughput:',d.nrequestdone/maxsimtime Solutions: 1. Answer will depend somewhat on order of players' ``moves.'' But any answer beginning with 'ax' and ending with 'z' is OK. 2. 
# goes through the list x and returns all indices i for which # x[i] = x[i+1]; elements in x are assumed to not be None def findtwins(x): nx = len(x) x.append(None) def compxii1(i): if x[i] == x[i+1]: return i else: return -1 indcs = map(compxii1,range(nx)) del x[nx] return filter(lambda u:u >= 0,indcs) x = [12, 5, 13, 13, 3, 4, 5, 5] findtwins(x) 3. # Disk.py # usage # python Disk.py endtoendsk onerot arrvrate maxsimtime import sys import math from SimPy.Simulation import * from random import Random,expovariate,random class gb: # globals rnd = Random(12345) ddrproc = None arrvproc = None # each instance of the ddr class will simulate one disk drive (but we # will have only one here) # current track is modeled as continuous value between 0 and 1 # endtoendsk is time to go from innermost to outermost track; onerot # is time for one rotation; data transfer time assumed negligible here class ddr(Process): def __init__(self,endtoendsk,onerot): Process.__init__(self) self.endtoendsk = endtoendsk self.onerot = onerot self.currtrack = 0.5 # state is used for code logic and debugging self.state = 'resting' self.nrequestdone = 0 self.queue = [] def Run(self): onerot = self.onerot endtoendsk = self.endtoendsk while True: if self.queue == []: self.state = 'resting' yield passivate,self self.state = 'seeking' job = self.queue.pop(0) yield hold,self,abs(job-self.currtrack)*endtoendsk self.currtrack = job sector = gb.rnd.random() currangle = math.fmod(now(),onerot) / onerot tmp = sector - currangle if tmp > 0: rotdelay = tmp * self.onerot else: rotdelay = (1-currangle+sector)*self.onerot self.state = 'waiting rotation' yield hold,self,rotdelay self.nrequestdone += 1 class arrivals(Process): def __init__(self,arrvrate): Process.__init__(self) self.arrvrate = arrvrate def Run(self): while True: yield hold,self,gb.rnd.expovariate(self.arrvrate) track = gb.rnd.random() gb.ddrproc.queue.append(track) if gb.ddrproc.state == 'resting': reactivate(gb.ddrproc) def main(): initialize() 
ees = float(sys.argv[1]) oner = float(sys.argv[2]) d = ddr(ees,oner) gb.ddrproc = d activate(d,d.Run()) arrvrate = float(sys.argv[3]) a = arrivals(arrvrate) gb.arrvproc = a activate(a,a.Run()) maxsimtime = float(sys.argv[4]) simulate(until=maxsimtime) print 'throughput:',d.nrequestdone/maxsimtime ","course":"ECS145"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (100) Below is a variant of the client-server example in Section 5.1.1.1, again using Python's thread module. Here, the number of clients increases and decreases over time, starting with none. When we again reach a situation with no clients, the server prints v and exits. Fill in the blanks. # multiple clients connect to server; clients come and go, # starting with none, but the server quits later when # there are no clients left; each client repeatedly sends a # letter k, which the server adds to a global string v and # echos back to the client; k = '' means # the client is # dropping out; when all clients are gone, server prints # final value of v # this is the server import socket import sys import blank (a) class gb: blank (b) v = '' threadslist = [] # client sockets firstclntyet = False port = int(sys.argv[1]) def serveclient(sock): if blank (c) : while True: k = sock.recv(1) if k == '': break gb.vlock.acquire() gb.v += k gb.vlock.release() blank (d) sock.close() blank (e) else: lstn = socket.socket(socket.AF_INET,socket.SOCK_STREAM) lstn.bind(('', gb.port)) lstn.listen(5) while True: (clnt,ap) = lstn.accept() blank (f) gb.firstclntyet = True thread.start_new_thread(serveclient,(clnt,)) lstn.close() def main(): thread.start_new_thread(serveclient,(None,)) while blank (g) : pass while blank (h) : pass print 'the final value of v is', gb.v if __name__ == '__main__': main() Solutions: 1. 
# multiple clients connect to server; # clients come and go, but the # server quits # when there are no clients; each client # repeatedly sends a letter k, which the # server adds to a global string v and # echos back to the client; k = '' means # the client is dropping out; when all clients # are gone, server prints final value of v # this is the server import socket import sys import thread class gb: vlock = thread.allocate_lock() v = '' threadslist = [] firstclntyet = False port = int(sys.argv[1]) def serveclient(sock): if sock: while True: k = sock.recv(1) if k == '': break gb.vlock.acquire() gb.v += k gb.vlock.release() sock.send(gb.v) sock.close() gb.threadslist.remove(sock) else: lstn = socket.socket(socket.AF_INET,socket.SOCK_STREAM) lstn.bind(('', gb.port)) lstn.listen(5) while True: (clnt,ap) = lstn.accept() gb.threadslist.append(clnt) gb.firstclntyet = True thread.start_new_thread(serveclient,(clnt,)) lstn.close() def main(): thread.start_new_thread(serveclient,(None,)) while not gb.firstclntyet: pass while gb.threadslist: pass print 'the final value of v is', gb.v if __name__ == '__main__': main() ","course":"ECS145"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (10) With the new reference classes, R has been moving somewhat away from its philosophy of having no . 2. (30) The pdist() function in the R package of the same name computes distances of rows of a matrix A and rows of a matrix B. (For our purposes, they will be distinct matrices.) Say A has n rows and B has p rows (the two matrices must have the same number of columns). Then the major output of pdist() is an matrix, with the element giving the distance from row of A to row of B. However, these distances are stored in linear form, in row-major order, i.e. first all distances from row 1 of A are stored, then all distances from row 2 of A, etc. 
Here is an example: > a [,1] [,2] [,3] [,4] [1,] 0 1 1 1 [2,] 1 0 0 1 > b [,1] [,2] [,3] [,4] [1,] 1 1 0 1 [2,] 0 0 0 1 > str(pdist(a,b)) Formal class 'pdist' [package \"pdist\"] with 4 slots ..@ dist : atomic [1:4] 1.41 1.41 1 1 .. ..- attr(*, \"Csingle\")= logi TRUE ..@ n : int 2 ..@ p : int 2 ... The function below takes a pdist object pdout and returns the distances in R matrix form, again the numbers in row 1 being distances from row 1 of A to rows of B. For example, > pdtomat(pdist(a,b)) [,1] [,2] [1,] 1.414214 0 [2,] 1.732051 1 Fill in the blank. NOTE: Write this as ``1'' in your quiz file, not ``1a''. pdtomat <- function(pdout) n <- pdout@n p <- pdout@p blank (a whole line) 3. (60) The R head() generic function prints the first few pieces of the object it is called on. For vectors, this is the first few elements; for matrices and data frames, it is the first few rows. The default view of ``few'' is 6. Here we will extend head() to objects of class \"ut\" in Section 12.3.2. Fill in the blanks: blank (a) <- function(utmat,k=6) n <- blank (b) k <- blank (c) for (row in 1:k) zeros <- blank (d) cat(zeros,\" \") for (col in row:n) rowcolval <- blank (e) cat(rowcolval,\" \") cat(\"\") Solutions: 1. side effects 2. pdtomat <- function(pdout) n <- pdout@n p <- pdout@p matrix(pdout@dist,byrow=TRUE,ncol=p) 3. head.ut <- function(utmat,k=6) n <- length(utmat mat[utmat ","course":"ECS145"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. SHOW YOUR WORK! Any arithmetical answer must be expressed as a common fraction (e.g. 2/3, 7/4), reduced to lowest terms. 1. (25) Urn I contains three blue marbles and three yellow ones, while Urn II contains five and seven of these colors. We draw a marble at random from Urn I and place it in Urn II. We then draw a marble at random from Urn II. 
Let denote the event that the first marble drawn is blue, denote the event that the second marble drawn is yellow, and so on. Fill in the blanks with equation numbers which will serve as reasons for the steps, 2. (25) Fill in the blanks (and only those blanks, no extra code elsewhere) in the following R code, which returns the (approximate) probability in (2.36) in the board game example: [numbers=left] boardsim <- function(nreps) count4 <- 0 countbonusgiven4 <- 0 for (i in 1: ) # blank 1 position <- sample(1:6,1) if (position == 3) # blank 2 position <- (position + sample(1:6,1)) # blank 3 if (position == 4) # blank 4 if (bonus) countbonusgiven4 <- countbonusgiven4 + 1 return( ) # blank 5 3. (25) Suppose the random variable X takes on only the values 0 and 1. Fill in the blank with either , , , , or : EX P(X = 1) 4. (25) Again for the board game example, suppose that the telephone report is that A ended up at square 1 after his first turn. Find the probability that he got a bonus. Solutions: 1. (2.2), (2.5) 2. boardsim <- function(nreps) count4 <- 0 countbonusgiven4 <- 0 for (i in 1:nreps) position <- sample(1:6,1) if (position == 3) bonus <- TRUE position <- (position + sample(1:6,1)) else bonus <- FALSE if (position == 4) count4 <- count4 + 1 if (bonus) countbonusgiven4 <- countbonusgiven4 + 1 return(countbonusgiven4/count4) 3. The answer is , since 4. Landing at square 1 after one turn means R+B is either 1 or 9. Let T = R + B. ","course":"ECS132"} {"quiz":" language=R,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. Let X be the number of dots we get in rolling a three-sided die once. (It's cylindrical in shape.) The die is weighted so that the probabilities of one, two and three dots are 1/2, 1/3 and 1/6, respectively. 
Note: Express all answers in this problem as common fractions, reduced to lowest terms, such as 2/3 and 9/7. [(a)] (10) State the value of . [(b)] (10) Find EX and Var(X). [(c)] (15) Suppose you win 2 for each dot. Find EW, where W is the amount you win. 2. This problem concerns the REVISED version of the committee/gender example. [(a)] (10) Find . Express your answer as an unsimplified expression involving combinatorial quantities such as . [(b)] (15) Find . Express your answer as a common fraction. 3. (15) State the (approximate) return value for the function below, in terms of w. You must cite an equation number in the book to get full credit. [numbers=left] xsim <- function(nreps,w) sumn <- 0 for (i in 1:nreps) n <- 0 while (TRUE) n <- n + 1 u <- runif(1) if (u < w) break sumn <- sumn + n return(sumn/nreps) 4. (15) Consider the parking space example on p.48. (NOT the variant in the homework.) Let N denote the number of empty spaces in the first block. State the value of Var(N), expressed as a common fraction. 5. (10) Suppose X and Y are independent, with variances 1 and 2, respectively. Find the value of c that minimizes Var[cX + (1-c)Y]. Solutions: 1.a 1/3 1.b * * 1.c EW = E(2X) = 2 EX = 10/3 2.a * 2.b * 3. 1/w, by (3.74) 4. 10(0.2)(1-0.2) = 8/5, by (3.82) 5. * So, the best c is 2/3. ","course":"ECS132"} {"quiz":" xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. This problem concerns the bus ridership example, which begins in Sec. 2.11 and is analyzed via simulation in Sec. 2.12.4. [(a)] (25) Find . [(b)] (20) Suppose the company charges 3 for passengers who board at the first stop, but charges 2 for those who join at the second stop. (The latter passengers get a possibly shorter ride, thus pay less.) So, the total revenue from the first two stops is . 
We want to find E(T), and the question is whether we can calculate it by first writing then using our answer from (a) above, and then reasoning that . Which of the following is correct? [(i)] The method proposed above is valid. (If you choose this answer, you must also state the numbers of the relevant ``mailing tubes.'') [(ii)] The above method is invalid, because is not necessarily equal to . [(iii)] , but the above method is invalid for other reasons. [(c), (d)] (20) (Note that the following concerns both part (c) and part (d).) Suppose on p.24 we wish to add code to find , not just as we are already doing. We'll need to insert two new lines of code for this (not counting another print() call after line 17). State what these two lines are, for your answers to (c) and (d). Include a comment, saying where the insertions should be made. Example: If the code x <- y + 3 should go between lines 8 and 9, write [fontsize=-2] x <- y + 3 # insert between lines 8 and 9 2. Twenty tickets are sold in a lottery, numbered 1 to 20, inclusive. Five tickets are drawn for prizes. [(a)] (25) Find the probability that two of the five winning tickets are even-numbered. (You may call built-in R functions, e.g. sqrt() in your answer.) [(b)] (10) Find the probability that two of the five winning tickets are in the range 1 to 5, two are in 6 to 10, and one is in 11 to 20. (You may call built-in R functions, e.g. sqrt() in your answer.) Solutions: 1.a 1.b Answer (i) is correct, using (3.13) (taking and ) and then (3.14). 1.c,d [numbers=left] totl10 <- 0 # insert between 3 and 4 totl10 <- totl10 + passengers # insert between 15 and 16 2.a [numbers=left] choose(10,2) * choose(10,3) / choose(20,5) 2.b [numbers=left] choose(5,2) * choose(5,2) * choose(10,1) / choose(20,5) ","course":"ECS132"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. 
There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. SHOW YOUR WORK! 1. (15) Exercise 7(b), Chapter 4, p.97. Give your answer as a decimal or common fraction. 2. (15) Exercise 8(a), Chapter 4, p.97. Give your answers as decimal or common fractions. 3. (20) Suppose X has a uniform distribution on the interval (20,40), and we know that X is greater than 25. What is the probability that X is greater than 32? Give your answer as a common fraction. 4. (25) Suppose U and V have the 2t/15 density on p.74. Let N denote the number of values among U and V that are greater than 1.5. (CORRECTED SUBSEQUENT TO QUIZ.) (So N is either 0, 1 or 2.) Find Var(N), expressing your answer as a decimal or common fraction. 5. (25) What is the (approximate) value returned from the following R code? [fontsize=-2] mean((rnorm(10000,mean=28,sd=5))^4) Your answer must be expressed as a definite integral. Solutions: 1 2. Let X be the error. On p.75, we have r = 0.5, q = -0.5. Using the formulas for the mean and variance at the bottom of p.75, we have 3. Because of the uniformity, . Following the pattern on p.79, we have 4. N has a binomial distribution with n = 2 and So, (once again) using (3.82), we have 5. The simulation is calculating , where X has a normal distribution with mean 28 and standard deviation 5. That expected value, by (4.21), is ","course":"ECS132"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), sum(), etc. 1. This problem deals with the ALOHA simulation example on p.23. [(a)] (20) What variable in the program is analogous with the number of lines in our ``notebook'' view? 
[(b)] (20) Suppose we want to add code to find P(collision during epoch 2). We'll set a variable ce2 to 0 early in the code, and near the end, we'll divide it by nreps. We'll need to add one more line of code. State the line number after which the new line is to be inserted, and state what code goes there. Sample answer: after line 10 insert \"if (x == 0) y <- 3\" 2. This problem involves the bus ridership example in Section 2.11. [(a)] (20) Find the probability that no passengers board the bus at the first three stops. [(b)] (20) Suppose it is observed that the bus arrives empty at the third stop. What is the probability that exactly two people boarded the bus at the first stop? [(c)] (20) Suppose we wish to find via simulation, by modifying the program on p.24. Say we initialize to 0 a variable named tl2 near the beginning of the program, and will divide it by nreps near the end of the code. State the line number after which the new line is to be inserted, and state what code goes there. Solutions: 1.a nreps 1.b After line 29, insert if (numsend == 2) ce2 <- ce2 + 1 2.a 0.5^3 2.b The event of the bus arriving empty at stop 3 is the same as . Then we have: where the numbers in the last step come from p.19. 2.c After line 12, insert: if (j == 2) tl2 <- tl2 + passengers ","course":"ECS132"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. Unless otherwise stated, give numerical answers as expressions, e.g. . Do NOT use calculators. 1. (20) Consider the ALOHA example from the text, for general p and q, and suppose that , i.e. there are no active nodes at the beginning of our observation period. Find . 2. Consider a three-sided die, as opposed to the standard six-sided type. The game is to keep rolling the die until we get a total of at least 3. 
Let N denote the number of times we roll the die. For example, if we get a 3 on the first roll, N = 1. If we get a 2 on the first roll, then N will be 2 no matter what we get the second time. The largest N can be is 3. The rule is that one wins if one's final total is exactly 3. [(a)] (20) Find the probability of winning. [(b)] (20) Find P(our first roll was a 1 | we won). [(c)] Extra Credit: How could we construct such a die? 3. Consider the ALOHA simulation example on pp.11-12. [(a)] (20) Suppose we wish to find instead of . What line(s) would we change, and how would we change them? [(b)] (20) In which line(s) are we in essence checking for a collision? Solutions: 1. 2a. 2b. Then divide. 2c. For example, construct the die as a cylinder, with the proper ratio of height to radius to achieve the right balance. 3a. Line 34, writing X2 == 1, and making the same change in the output labeling in line 40. (Latter not counted wrong if missing.) 3b. Line 13. ","course":"ECS132"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. SHOW YOUR WORK! 1. (20) Suppose we roll our usual three-sided die, with probabilities 1/2, 1/3 and 1/6 of coming up 1, 2 or 3 dots, respectively. Let X denote the number of dots. Find . Express your answer as a single common fraction. 2. (25) Write R code (but not simulation) that computes the value of 3. (25) Consider the disk performance example on p.76. We will scale things so that the track number is a continuous value in [0,1]. Fill in the gaps in the following code, which finds the (approximate) mean time to satisfy a disk access request. The arguments fullsweep and fullrotate are the time needed to go from track 0.0 to track 1.0, and the time needed to make one revolution of the disk, respectively. 
[fontsize=-2] disksim <- function(naccesses,fullsweep,fullrotate) currtrack <- 0.5 oldtrack <- 0.5 sumacctime <- 0.0 for (i in 1:naccesses) currtrack <- # gap seek <- abs(currtrack - oldtrack) oldtrack <- currtrack seektime <- # gap rottime <- # gap sumacctime <- # gap return( ) # gap 4. Consider the following variant of the bus ridership example on p.20 and our current homework. The probability that a passenger alights is now q instead of 0.2, and the number of new passengers who wish to board the bus at a stop, N, is now assumed to have a Poisson distribution with parameter . The capacity of the bus is still c. Answer the following, using expressions in c, q, and the stationary probability vector (you may not need them all). [(a)] (15) Find the transition probabilities and . [(b)] (15) Let S denote the number of stops that a passenger travels. If for instance she boards at stop 3 and alights at stop 8, then S = 5. Find Var(S) Solutions: 1. 2. [fontsize=-2] pnorm(30,mean=28,sd=5) - pnorm(27,mean=28,sd=5) 3. [fontsize=-2] disksim <- function(naccesses,fullsweep,fullrotate) currtrack <- 0.5 oldtrack <- 0.5 sumacctime <- 0.0 for (i in 1:naccesses) currtrack <- runif(1) seek <- abs(currtrack - oldtrack) oldtrack <- currtrack seektime <- seek * fullsweep rottime <- runif(1,0,fullrotate) sumacctime <- sumacctime + seektime + rottime return(sumacctime/naccesses) 4.a For , that transition will occur if 0 board, which has probability For the case , either 1 alights and 0 board or 2 alight and 1 boards, so we have 4.b S has a geometric distribution with parameter q, so by (3.75), . ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), sum(), etc. 1. 
Consider the derivation on p.19 of our text, starting with (2.48). We would like to add reasons for the steps, as in (2.9) and (2.10). Write your answers as equation numbers, e.g. (8.88). (Do not include the English word ``equation.'' Remember, this will be graded by computer, and in this problem the script will assume a numerical answer.) [(a)] (10) Give the ``mailing tube'' for (2.48). [(b)] (15) Give the ``mailing tube'' for (2.49). 2. Suppose three fair dice are rolled. We wish to find the approximate probability that the first die shows fewer than 3 dots, given that the total number of dots for the 3 dice is more than 8, using the code below. Fill in the blanks with a single line of code in each case. [(a)] (15) Fill in Line 5. [(b)] (15) Fill in Line 8. [(c)] (15) Fill in Line 11. [numbers=left] dicesim <- function(nreps) count1 <- 0 count2 <- 0 for (i in 1:nreps) if (sum(d) > 8) count1 <- count1 + 1 3. Consider the bus ridership example, in Sec. 2.11 of our text. [(a)] (15) Find the probability that fewer people board at the second stop than at the first. [(b)] (15) Someone tells you that as she got off the bus at the second stop, she saw that the bus then left that stop empty. Find the probability that she was the only passenger when the bus left the first stop. Solutions: 1.a (2.2) 1.b (2.5) 2. [numbers=left] dicesim <- function(nreps) count1 <- 0 count2 <- 0 for (i in 1:nreps) d <- sample(1:6,3,replace=T) if (sum(d) > 8) count1 <- count1 + 1 if (d[1] < 3) count2 <- count2 + 1 return(count2 / count1) 3.a 3.b We are given that . But we are also given that . Then ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. 
choose(), sum(), integrate(), etc. 1. This problem concerns the Markov inventory model, Sec. 4.6, but with a little change: When replenishment occurs, it might not be a full shipment. A proportion 0.2 of the time, the store's supplier has a shortage, and delivers only r/2 new items rather than r. The function inventory() will be revised accordingly. Assume that r is an even number, and that a statement r2 <- r / 2 has been inserted between the lines containing calls to function() and matrix(). [(a)] (15) The line tm[2,r] <- q must be altered. Show what the new line should be. [(b)] (15) In addition, a new line of code that assigns to some element of row 2 in tm must be inserted. Show what that new line should be. (Don't worry about where it will be inserted, or what modifications, if any, need to be made concerning other rows.) [(c)] (15) Give a loop-free R expression, using matrix and/or vector operations, for the long-run average amount of stock. It will involve pivec, the vector. 2. Consider the counterexample to the statement Cov(X,Y) = 0 implies X,Y independent starting at the bottom of p.138. Instead of using the unit disk, we might take various other regions for our counterexample to the above-displayed statement. In each case below, answer either C or NC (no quotation marks necessary) according to whether the proposed distribution would indeed provide a counterexample. [(a)] (5) (X,Y) has a uniform distribution on the rectangle with corners at (-1,-8), (-1,8), (1,-8) and (1,8). [(b)] (5) (X,Y) has a uniform distribution on the diamond with corners at (-1,0), (0,-8), (1,0) and (0,8). [(c)] (5) (X,Y) has a uniform distribution on the ellipse having axis intercepts at (-1,0), (0,-8), (1,0) and (0,8). 3. (10) Consider the dice correlation example in Sec. 7.2.2.1, but with a twist: The blue die is an ordinary, fair die, but the yellow one is heavily weighted toward the extremes: 1 and 6 have higher probabilities than 2, 3 and 4. 
The probabilities are still symmetric, so that i has the same probability as 7-i. The question at hand is, how would this affect ? Choose one: [(i)] The correlation would be larger than 0.707. [(ii)] The correlation would be smaller than 0.707. [(iii)] The correlation would still be 0.707. [(iv)] There is not enough information to tell. 4. (10) Suppose we model light bulb lifetimes as having a normal distribution with mean and standard deviation 500 and 50 hours, respectively. Give a loop-free R expression for finding the value of d such that 30% of all bulbs have lifetime more than d. 5. (10) Say , with the being i.i.d. (a very frequently-used term; see beginning of Sec. 5.5.4.5) with uniform distributions on (0,1). Give an R expression for the approximate value of . 6. (10) Figure 5.2 is a nice illustration of the Central Limit Theorem. The curves are more and more bell-shaped as r grows (and would be even more so with still larger r). But in view of the formal statement of the CLT, the bell-shaped nature of those curves does not actually follow from the CLT. Explain why, in a single short line. Solutions: 1.a tm[2,r] <- 0.8 * q 1.b tm[2,r2] <- 0.2 * q 1.c pivec 2. NC, C, C 3. Now , so would increase. Answer (ii) is correct. 4. qnorm(1-0.30,500,50) 5. W has an approximate normal distribution, with mean and variance . So we need pnorm(23.4,25,sqrt(50/12)) 6. Theorem 15---the formal statement of the CLT---merely says that the cdfs converge, not that the densities converge. In this case, the densities actually do converge, but in some other situations they do not. The cdfs could be very \"kinky\" so that the densities don't converge, even though the cdfs do. ","course":"ECS132"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 
Unless otherwise stated, give numerical answers as expressions, e.g. . Do NOT use calculators. 1. (20) Fill in the blank: Density functions for continuous random variables are analogs of the functions that are used for discrete random variables. 2. (20) Suppose for some random variable W, for , with be 0 and 1 for and , respectively. Find for . 3. (10) Suppose X has a binomial distribution with parameters n and p. Then X is approximately normally distributed with mean np and variance np(1-p). For each of the following, answer either A or E, for ``approximately'' or ``exact,'' respectively: [(a)] (10) distribution of X is normal [(b)] (10) E(X) is np [(c)] (5) Var(X) is np(1-p) 4. Suppose light bulb lifetimes have an exponential distribution with mean 100.0 hours, i.e. . We use our first light bulb, with it lasting for hours. When it burns out, we replace it with a second bulb, which lasts hours. Then is the time of the second replacement. [(a)] (10) Give numerical expressions for the mean and variance of . [(b)] (5) State (the actual function, not the name of a family etc.). [(c)] (10) Fill in the blank: Solutions: 1. probability mass functions 2. 3a. A (pages 30-31) 3b. E (pages 30-31) 3c. E (pages 30-31) 4a. , (page 59) 4b. (page 59) 4c. (like (2.38)) ","course":"ECS132"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. SHOW YOUR WORK! 1. (10) In the ALOHA example, pp.9-12, find . Your answer must consist of a single number found in those pages. 2. Find the following quantities for the density (5.17). In all cases, do NOT evaluate any integrals; leave your answer in integral form. [(a)] (10) [(b)] (10) [(c)] (15) 3. Find the following quantities for the dice example on p.115. In all cases, leave your answers as numerical expressions, e.g. . Feel free to cite (3.29). 
[(a)] (10) Cov(X,2S) [(b)] (10) Cov(X,S+Y) [(c)] (10) Cov(X+2Y,3X-Y) [(d)] (10) 4. (15) Consider the ``Senthi'' example, pp.118-119. Let R denote the time it takes to go from state 1 to state 3. Find . Leave your answer in integral form. Solutions: 1. 0.20 (see just below (2.14)) 2a. 2b. 2c. By definition, The region in question has an irregular shape, so the answer is a two-part integral: 3a. This and the next two parts make use of (5.22) and other mailing tubes. using (5.72). 3b. 3c. In (5.22), take a = 1, b = 2, c = 3 and d = -1. Then use the fact that Cov(X,X) = Var(X), etc. The result is 4. Write , where is the time to go from state i to state i+1. are independent, due to the Markov/memoryless property, and as the example points out, has an exponential distribution with parameter i(g-i). So, we are basically in the same situation as the backup battery example in Sec. 5.5.4, and where . ","course":"ECS132"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. Unless otherwise stated, give numerical answers as expressions, e.g. . Do NOT use calculators. 1. (15) Using Equation (1.64), give a numerical expression for . 2. Suppose X and Y are independent random variables with standard deviations 3 and 4, respectively. [(a)] (10) Find Var(X+Y). [(b)] (10) Find Var(2X+Y). 3. (30) Fill in the blanks in the following simulation, which finds the approximate variance of N, the number of rolls of a die needed to get the face having just one dot. [fontsize=-2] onesixth <- 1/6 sumn <- 0 sumn2 <- 0 for (i in 1:10000) n <- 0 while(TRUE) ________________________________________ if (______________________________ < onesixth) break sumn <- sumn + n sumn2 <- sumn2 + n^2 approxvarn <- ____________________________________________ cat(\"the approx. value of Var(N) is \",approxvarn,\"\") 4. 
(20) Jack and Jill keep rolling a four-sided and three-sided die. The first player to get the face having just one dot wins, except that if they both get a 1, it's a tie, and play continues. Let N denote the number of turns needed. Find . 5. (15) Let X be the total number of dots we get if we roll three dice. Find an upper bound for , using our course materials. Solutions: 1. . The last term is , and the next-to-last is . 2. By (1.61), . By (1.48), , so again by (1.61), . 3. [fontsize=-2] n <- n + 1 runif(1) sumn2/10000 - (sumn/10000)^2 4. * 5. Use Markov's Inequality: (Of course, it's a very poor bound in this case.) ","course":"ECS132"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(). 1. This problem concerns the bus ridership example, which begins in Sec. 2.11. [(a)] (20) The bus driver has the habit of exclaiming, ``What? No new passengers?!'' every time he comes to a stop at which . Let N denote the number of the stop (1,2,...) at which this first occurs. Find P(N = 3). [(b)] (20) Find Var(N) in (a). [(c)] (20) Let T denote the number of stops, out of the first 6, at which 2 new passengers board. For example, T would be 3 if , , , , , and . Find . [(d)] (20) The bus ridership problem is simulated in Section 2.12.4. Give a single call to a built-in R function that can replace lines 8-11. 2. (20) A machine contains one active battery and two spares. Each battery has a 0.1 chance of failure each month. Let L denote the lifetime of the machine, i.e. the time in months until the third battery failure. Find P(L = 12). Solutions: 1.a N has a geometric distribution, with p = probability of 0 new passengers at a stop = 0.5. Thus , by (3.75). 1.b , by (3.84). 
1.c T has a binomial distribution, with n = 6 and p = probability of 2 new passengers at a stop = 0.1. Then Note that your electronic answer could be [numbers=left] dbinom(4,6,0.1) 1.d Given that k passengers are currently on the bus, the number who alight has a binomial distribution, with n = k and p = 0.2. The code in lines 8-11 is merely simulating the k trials, in this case with k being the variable named passengers. Thus the code could be replaced by [numbers=left] passengers <- passengers - rbinom(1,passengers,0.2) 2. The number of months until the third failure has a negative binomial distribution, with r = 3 and p = 0.1. Thus the answer is obtained by (3.111), with k = 12: ","course":"ECS132"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. Unless otherwise stated, give numerical answers as expressions, e.g. . Do NOT use calculators. 1. (15) Consider random variables and , for which for i = 1,2, and . Find . 2. (15) Suppose we have random variables X and Y, and define the new random variable Z = 8Y. Then which of the following is correct? (i) . (ii) . (iii) . (iv) . (v) . (vi) There is no special relationship. 3. Suppose for and the density is 0 elsewhere. [(a)] (10) Find . [(b)] (10) Which statement concerning this distribution is correct? (i) IFR. (ii) DFR. (iii) U-shaped failure rate. (iv) Sinusoidal failure rate. (v) Failure rate is undefined for . 4. (15) Consider the coin game on p.33. Find . 5. (15) In the backup battery example on p.85, find Var(W). 6. (10) Consider the ``8st'' density example on p.74. Find . Express your answers as a definite integral, ready for any calculus student to compute an actual number from. 7. (10) What will be the (approximate) output of the following R code? 
[fontsize=-2] s <- 0 s2 <- 0 for (rep in 1:10000) z3 <- rnorm(3) # generate 3 N(0,1) random variates tot <- sum(z3^2) # sum of the squares of the 3 variates s <- s + tot s2 <- s2 + tot^2 m <- s/10000 print(m) print(s2/10000 - m^2) Solutions: 1. 3 2. (i) 3a. , so 3b. IFR 4 5. 6. 7. Using Sections 2.3.3.1 and 2.3.5.1, and (1.4.6), we have that the output will be 3 and 6. ","course":"ECS132"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. Unless otherwise stated, give numerical answers as expressions, e.g. . Do NOT use calculators. 1. Suppose the random vector has mean and covariance matrix [(a)] (10) Fill in the three missing entries. [(b)] (10) Find . [(c)] (10) Find . [(d)] (10) Find . [(e)] (15) Find the covariance matrix of . [(f)] (15) If in addition we know that has a normal distribution, find , in terms of . [(g)] (15) Consider the random variable . Which of the following is true? (i) . (ii) . (iii) . (iv) In order to determine which of the two variances is the larger one, we would need to know whether the variables have a multivariate normal distribution. (v) doesn't exist. 2. (15) What is the (approximate) output of this R code: [fontsize=-2] count <- 0 for (i in 1:10000) count1 <- 0 count2 <- 0 count3 <- 0 for (j in 1:20) x <- runif(1) if (x < 0.2) count1 <- count1 + 1 else if (x < 0.6) count2 <- count2 + 1 else count3 <- count3 + 1 if (count1 == 9 && count2 == 2 && count3 == 9) count <- count + 1 cat(count/10000) Solutions: 1a. 1b. -0.2 1c. 1d. 3 1e. 1f. 1g. (ii), by (3.29) 2. ","course":"ECS132"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 
Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(). 1. Consider the random variable X on p.92. [(a)] (15) Find the probability that X is between 0.1 and 0.2. [Should have been 1.1 and 1.2. Otherwise the probability is 0.] [(b)] (15) Find . 2. Consider the ALOHA Markov chain example, beginning on p.68, but with 4 nodes in the network, not just 2. [(a)] (10) How many rows will the P matrix now have? [(b)] (15) Find , for the case q = 0.2, p = 0.6. 3. Suppose light bulb lifetimes are exponentially distributed with mean 10.0 months. We try them one at a time, until we find the third one that lasts longer than 5.0. Let N denote the number of light bulbs we try. [(a)] (15) What famous parametric family does the distribution of N belong to? [(b)] (15) Find Var(N). 4. (15) In the network intrusion example, p.97, suppose Jill logs in twice. Let X and Y denote the number of disk sectors she reads in the two sessions, assumed to be independent. Find . Solutions: 1.a See note in problem statement. Probability is 0 as stated. For 1.1, 1.2: 1.b 2.a 5 2.b 3.a Negative binomial. 3.b From (3.114): Here r = 3 and The integral can be computed by hand, or as 1 - pexp(5.0,0.1) 4. X+Y has a normal distribution with mean and variance . The specified probability is then computed as 1 - pnorm(1088,1000,sqrt(450)) ","course":"ECS132"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), sum(), etc. 1. In the R code at the bottom of p.69, suppose we wish to change it to find . Replace each of these lines below. 
You may remove lines if you wish (do not add any); if so, replace the given line with a comment line, # line removed so that the original line numbers are retained. [(a)] (10) Show the new line 2. [(b)] (10) Show the new line 3. [(c)] (10) Show the new line 4. 2. Coin A has probability 0.6 of heads, Coin B is fair, and Coin C has probability 0.2 of heads. I toss A once, getting X heads, then toss B once, getting Y heads, then toss C once, getting Z heads. Let W = X + Y + Z, i.e. the total number of heads from the three tosses (ranges from 0 to 3). [(a)] (10) Find P(W = 1). [(b)] (10) Find Var(W). 3. This problem concerns the parking example, pp.59-60. [(a)] (15) Find . [(b)] (10) Find P(D = 1). [(c)] (10) Say Joe is the one looking for the parking place. Paul is watching from a side street at the end of the first block (the one before the destination), and Martha is watching from an alley situated right after the sixth parking space in the second block. Martha calls Paul and reports that Joe never went past the alley, and Paul replies that he did see Joe go past the first block. They are interested in the probability that Joe parked in the second space in the second block. Fill in the blank, using only math and probability symbols, N and D---no English except for and, or and not: The probability they wish to find is P(). [(d)] (15) Add to the simulation code on p.60, so that it finds and prints (the latter via print()) the approximate value of P(we park in the first block). You must use only one R statement, though it will probably consist of nested function calls. Hint: See p. 21, bottom. 4. (15) March, April, May and June (each of these is a common woman's name, by the way) each roll a die until an event occurs: For March, the event is to roll a 3; for April, a 4; for May, a 5; and for June, a 6. Let T denote the total number of rolls they make. Find P(T = 28). Solutions: 1. for (i in 0:4) # line removed prob <- prob + dbinom(i,4,0.5) * dbinom(i,6,0.5) 2a. * 2b. 
Var(W) = Var(X) + Var(Y) + Var(Z), by independence. Since X is an indicator random variable, , etc. The answer is 3a. 3b. * 3c. 3d. print(mean(nvalues <= 10)) 4. Actually, it doesn't matter what the different women's numerical goals are; the probability would be the same even if each woman was rolling until she got, say, a 5. The random variable T is then a sum of 4 independent geometrically-distributed random variables, each having the parameter p = 1/6. As noted in the material surrounding (3.109), such a sum has a negative binomial distribution. Thus P(T = 28) is computed as choose(28-1,4-1) * (1-1/6)^(28-4) * (1/6)^4 ","course":"ECS132"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. Unless otherwise stated, give numerical answers as expressions, e.g. . Do NOT use calculators. 1. (15) What is the relation of the value of printed out by R's var() function to the value I use? (Assume neither is 0.) (i) My value is larger. (ii) R's value is larger. (iii) They are equal. (iv) The var() function has no relation to my ; they just have similar names. 2. Consider the R code on p.125. [(a)] (15) Of all the variables in that code, which one---if any---corresponds to the ``number of lines in the notebook''? [(b)] (15) Which R expression in that code is a standard error? And for which variable in the code is that expression a standard error? 3. (5) Fill in the blank. The variable Y on p.131 is an example of what is generally called a/an variable. 4. (10) Consider Equation (4.16), p.122. In each of the entries in the table below, fill in either R for random, or NR for nonrandom: rr quantity & R or NR? & & & & (That quantity on the left, second line, is , ``W-bar,'' not so clear in the table.) 5. (10) Consider , the estimator of a population proportion p, based on a sample of size n. 
Give the expression for the standard error of . 6. (15) The term random sample means with replacement. If it is without replacement, it is called a simple random sample. Suppose we take a simple random sample of size 2 from a population consisting of just three values, 66, 67 and 69. Let denote the resulting sample mean. Find . 7. (15) Suppose we have a random sample , and we wish to estimate the population mean , as usual. But we decide to place double weight on , so our estimator for is Find E(U) and Var(U) in terms of and the population variance . Do reasonable algebraic simplification. Solutions: 1. (ii) 2a. numruns 2b. s/sqrt(nreps) 3. indicator 4. R, R, NR, NR 5. 6. 7. , ","course":"ECS132"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. SHOW YOUR WORK! 1. (20) In the example of the population of three people, p.161, find the probability that overestimates the population mean . Give your answer in a common fraction, simplified to the lowest terms. 2. (20) State the mailing tube used in line 14 of the code on p.166, citing a specific equation number. 3. (30) The following code finds the true confidence level for a certain kind of confidence interval. Fill in the gaps. Note: The optional argument prob in sample() gives sampling probabilities. If for instance we wish to simulate one roll of a trick die that has probability 0.25 of a 6 and probabilities 0.15 each for 1,2,3,4,5, the call would be sample(1:6,1,prob=c(0.15,0.15,0.15,0.15,0.15,0.25)). [fontsize=-2] simpci <- function(m,k,r) contain <- vector(length=m) for (i in 1:m) samp <- sample( ) # gap 1 center <- # gap 2 rad <- 1.96 * sqrt(center*(1-center)/k) contain[i] <- # gap 3 print(mean(contain)) 4. (30) Here you will use DES.R to simulate a mirrored file server. 
There are two disks with identical contents, assumed here to be read-only. Read requests arrive to the server according to a Poisson process with intensity parameter v. The track number of a request is uniformly distributed on (0,1). The read/write head of a disk travels at speed s during seeks, so that the time to traverse all tracks is 1/s. After a seek, if no other requests are pending, the read/write head of a disk stays at the track of its last seek. For simplicity, assume that when a request is fulfilled by a disk and other requests are waiting, this disk will be the one to serve the next request, even if the other disk is idle and its current track is closer. Fill in the gaps. Note: rep() does a ``repeat'' operation, so that for instance rep(5,3) is (5,5,5). [fontsize=-2] dsim <- function(totsimetime,v,s) initglbls(v,s) ... return(totmovtime/(2*totsimetime)) initglbls <- function(v,s) arvrate <<- v # arrival rate seekspd <<- s # seek speed # last known positions of the 2 disks, in (0,1) lastpos <<- rep(0.0,2) # destinations of the 2 disks (if moving), in (0,1) nextpos <<- rep(0.0,2) # booleans for currently idle idle <<- rep(TRUE,2) # start times of the current moving times of the 2 disks # (if moving), changing from an idle state startmov <<- rep(0.0,2) # number of waiting requests q <<- 0 arvtype <<- 1 # arrival type seektype <<- 2 # seek done type totmovtime <<- 0 reactevnt <- function(head) if (head[2] == arvtype) # arrival ... 
else # seek done disknum <- head[3] lastpos[disknum] <<- nextpos[disknum] if (q > 0) nextpos[disknum] <<- # gap 1 q <<- # gap 2 seektime <- # gap 3 seekdonetime <- sim = (69+70+72)/3 = 70.33 X currtime + seektime schedevnt(seekdonetime,seekdonetype,disknum) # generate next arrival arvtime <- sim currtime else # seek done disknum <- head[3] lastpos[disknum] <<- nextpos[disknum] if (q > 0) nextpos[disknum] <<- runif(1) q <<- q - 1 seektime <- abs(nextpos[disknum]-lastpos[disknum]) / seekspd seekdonetime <- sim currtime - startmov[disknum] idle[disknum] <<- TRUE ","course":"ECS132"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), pnorm(), etc. 1. Consider the class enrollment size example, starting on p.97. Suppose the distribution of enrollment size is Poisson, rather than approximate normal. Assume the mean is again 28.8. [(a)] (20) Find Var(N). [(b)] (20) Find . [(c)] (15) Find . 2. Consider the network intrusion example on p.97. [(a)] (15) Let . Name the parametric family of densities that Y's density belongs to, including the parameter values, if any. [(b)] (15) Let G denote the indicator random variable for the event . Find Var(G). 3. (15) Suppose R didn't include the sample() function. We could use the code below instead. Here's an example of usage: > x <- samp(c(1,6,8),1000,c(0.2,0.5,0.3)) > sum(x==1) [1] 224 > sum(x==6) [1] 495 > sum(x==8) [1] 281 Here we generate 1000 numbers from 1, 6 and 8, with probabilities 0.2, 0.5 and 0.3, respectively, and then count the numbers of 1s, 6s and 8s we get. The built-in R function cumsum() finds cumulative sums, e.g. 
> cumsum(c(3,8,1)) [1] 3 11 12 [numbers=left] # what we'd do if there were no sample() # ftn in R; sample n items (with # replacement) from the vector nums, with # probabilities given by prob; # the vectors nums and prob must have # the same length (not checked here); # not claimed efficient samp <- function(nums,n,prob) samps <- vector(length=n) cumulprob <- cumsum(prob) for (i in 1:n) samps[i] <- sample_one_item(nums,cumulprob) return(samps) sample_one_item <- function(nums,cumulprob) u <- runif(1) lc <- length(cumulprob) for (j in 1:(lc-1)) if (BLANKa) BLANKb BLANKc Solutions: 1.a Since N has a Poisson distribution, Var(N) = E(N) = 28.8. 1.b For a Poisson random variable M, , so answer is ppois(26,28.8) 1.c (4.50) still holds, and evaluates to [numbers=left] (1 - ppois(29,28.8)) / (1 - ppois(24,28.8)) 2.a Chi-square, 1 degree of freedom. 2.b From Section 3.6: 3. [numbers=left] # what we'd do if there were no sample() ftn in R; sample n items (with # replacement) from the vector nums, with probabilities given by prob; # the vectors nums and prob must have the same length (not checked here); # not claimed efficient samp <- function(nums,n,prob) samps <- vector(length=n) cumulprob <- cumsum(prob) for (i in 1:n) samps[i] <- sample_one_item(nums,cumulprob) return(samps) sample_one_item <- function(nums,cumulprob) u <- runif(1) lc <- length(cumulprob) for (j in 1:(lc-1)) if (u < cumulprob[j]) return(nums[j]) return(nums[lc]) ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), sum(), etc. 1. (15) A class has 68 students, 48 of which are CS majors. The 68 students will be randomly assigned to groups of 4. 
Find the probability that a random group of 4 has exactly 2 CS majors. 2. This problem concerns the bus ridership example, Sec. 2.11 in our book. [(a)] (15) Find . [(b)] (15) Find . 3. This problem again concerns the bus ridership example, but focuses on the simulation, Sec. 2.12.4. Here we are interested in finding . [(a)] (10) Where should a line tot_l2 <- 0 be placed? Answer using a half-line number, e.g. 6.5 if you think this code should be inserted between lines 6 and 7. [(b)] (15) What code should be inserted at line 12.5? [(c)] (10) Give a print statement to go after line 16, printing the approximate value of 4. (10) Say a large research program measures boys' heights at age 10 and age 15. Call the two heights X and Y. So, each boy has an X and a Y. Each boy is a ``notebook line'', and the notebook has two columns, for X and Y. We are interested in Var(Y-X). Which of the following is true? (Answer with a Roman numeral, e.g. (v).) [(i)] [(ii)] [(iii)] [(iv)] [(v)] [(vi)] [(vii)] None of the above. 5. (10) Suppose at some public library, patrons return books exactly 7 days after borrowing them, never early or late. However, they are allowed to return their books to another branch, rather than the branch where they borrowed their books. In that situation, it takes 9 days for a book to return to its proper library, as opposed to the normal 7. Suppose 50% of patrons return their books to a ``foreign'' library. Find Var(T), where T is the time, either 7 or 9 days, for a book to come back to its proper location. (Hint: Use the concept of indicator random variables.) Solutions: 1. 2.a 2.b First, note that ; then compute . 3.a 3.5 (or earlier) 3.b if(j == 8) tot_l2 <- tot_l2 + passengers 3.c print(tot_l2 / nreps) 4. Since X and Y are positively correlated, their covariance is positive, so the answer is (iii). 5. , where I is an indicator random variable for the event that the book is returned to a ``foreign'' branch. 
Then ","course":"ECS132"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. Unless otherwise stated, give numerical answers as expressions, e.g. . Do NOT use calculators. 1. (35) Consider the lottery ticket example on pp.147ff. Suppose 500 tickets are sold, and you have data on 8 of them. Continue to assume sampling with replacement. Fill in the blank: The probability that the MLE is exactly equal to the true value of c is 1 - . 2. In an analysis published on the Web (Sparks et al, Disease Progress over Time, The Plant Health Instructor, 2008), the following R output is presented: [fontsize=-2] > severity.lm <- lm(diseasesev ~ temperature,data=severity) > summary(severity.lm) Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 2.66233 1.10082 2.418 0.04195 * temperature 0.24168 0.06346 3.808 0.00518 ** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Fill in the blanks: [(a)] (15) The model here is = + [(b)] (15) The two null hypotheses being tested here are and . 3. Let X denote the number we obtain when we roll a single die once. Let denote the generating function of X. [(a)] (20) Find . [(b)] (15) Suppose we roll the die 5 times, and let T denote the total number of dots we get from the 5 rolls. Find . Solutions: 1. 2a. mean, diseasesev, temperature (the word mean is crucial) 2b. , 3a. 3b. Write . Then ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), sum(), integrate(), etc. 1. 
This problem concerns the dice game example, Section 7.3.5 of our book. In writing R code, assume that the matrix (7.60) is already stored in a matrix named v. And there is good news! Players now win 8 each time they roll a four, five or six. Let , and represent how much a player wins in all her rolls that come up 1 dot, 2 or 3 dots, or 4, 5 or 6 dots, respectively; for example, . Denote the (column) vector consisting of , and by U. Find the following quantities. Unless specifically allowed, do not use loops, + or sum(). Do not make corrections for continuity. [(a)] (10) [(b)] (10) [(c)] (10) (exact) [(d)] (10) (exact) [(e)] (15) (approximate) [(f)] (15) where [(g)] (10) 2. (20) Here you will write code to help Justin conduct his opinion poll on Amanda's chances of winning the election. It will be an e-mail poll. Assume (as will actually be the case when my grading script runs) that we have the following global variables: voters, a data frame containing information on all the registered voters in Davis, one voter per row; emailcol, the column number in which the voters' e-mail addresses are stored; and n, the number of people to sample. The code will display a simple random sample of e-mail addresses. Single line of code (semicolons OK), no loops. Solutions: 1.a 1.b 1.c dbinom(12,50,2/6) 1.d pbinom(12,50,2/6) 1.e pnorm(12,50*2/6,sqrt(50*(2/6)*(4/6))) 1.f a <- rbind(c(1,-1,0),c(0,1,1)); a 1.g a <- matrix(0,nrow=3,ncol=3); diag(a) <- c(5,2,8); a 2. polled <- sample(1:nrow(voters),n,replace=F); voters[polled,emailcol] ","course":"ECS132"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. SHOW YOUR WORK! 0. Advice: The first three problems have half-line answers and should be very quick. Most of the other problems also have very short answers, but may or may not be quick. 1. 
(15) Fill in the blank with a term from our course: In comparing two estimators of some population quantity, we might consider the better one to be the one with smaller . 2. (15) Consider the R code on p.262, which consists of assignments to md and lmout. Suppose we wish to fit a model with no first-degree term, i.e. (9.12) would change to How should we change the code in the line on p.262 that assigns to lmout? Assume that the line assigning to md is retained. 3. (10) Consider the R code at the top of p.200. Give an approximate 95% confidence interval for the population value . Your answer will be in the form , where c and d are numerical expressions, e.g. . 4. Suppose the random variable X is equal to 1, 2 and 3, with probabilities c, c and 1-2c. The value c is thus a population parameter. We have a random sample from this population. [(a)] (15) Show that the Method of Moments Estimator of c, which we will denote by , is . [(b)] (15) Find the bias of . Cite mailing tubes and other reasons! 5. In the notation of Chapter 9, give matrix and/or vector expressions for each of the following in the linear regression model: [(a)] (10) , our estimator of [(b)] (10) the standard error of the estimated value of the regression function at , where . 6. (10) Suppose Jack and Jill each collect random samples of size n from a population having unknown mean but KNOWN variance . They each form an approximate 95% confidence interval for , using (6.21) but with s replaced by . Find the approximate probability that their intervals do not overlap. Express your answer in terms of , the cdf of the N(0,1) distribution. Solutions: 1. Mean squared error. Partial credit was given for some other answers, but it was emphasized that MSE is the major criterion, as it balances variance and bias. 2. [fontsize=-2] lm(md[,2] ~ md[,3]) 3. 4a. , so c = (3-EX)/3. So, set . 4b. So, the bias is 0. 5a. Note that hats! 5b. 
so Then (5.88) yields Thus the sample estimated variance is so that the standard error is 6. The probability of nonoverlap is double the probability that Jack's interval is entirely to the left of Jill's. Since each interval has radius , Jack's interval will be entirely to the left of Jill's if we have So, we need the distribution of . But and by independence so W has standard deviation . Thus You might be surprised to see that the answer is independent of n. The actual value is about 0.16. So, Jack and Jill have about a 16% chance of having nonoverlapping intervals. ","course":"ECS132"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), pnorm(), etc. Note too the R function integrate(), e.g. > integrate(function(x) x^2,0,1) 0.3333333 with absolute error < 3.7e-15 The limits of integration must be numbers or Inf or -Inf, not symbols. Thus one cannot use it for the inner integral in a double integral. 1. Say X and Y have means 1 and 2, with variances 4 and 8, and with covariance -1. Find the following: [(a)] (20) Var(X+Y) [(b)] (20) [(c)] (15) Cov(X,X+Y) 2. Find the cdf values: [(a)] (15) In the marbles problem, pp.156-158, find . [(b)] (20) In the example in Sec. 8.2.3, find . 3. (10) In Sec. 7.3.5, find Var(X - 2Y + Z), using matrix methods. Note: Recall that in R, matrix multiplication is done via [numbers=left] > m <- matrix(c(5,12,13,3,4,5),ncol=2) > m [,1] [,2] [1,] 5 3 [2,] 12 4 [3,] 13 5 Solutions: 1.a 1.b 1.c 2.a 0.002 + 0.024 + 0.162 + 0.073 2.b 3. [numbers=left] 50 * c(1,-2,1) ","course":"ECS132"} {"quiz":" Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. 
There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (25) Consider the simple board game, pp.15ff, starting at 0. Change the game so that it has 16 squares, numbered 0 to 15, but is otherwise identical to the original one. Let X denote the square the player lands on after the first turn. Find E(X), expressing your answer as a sum of fractions, e.g. 3/2 + 1/5 (-3/8). 2. (25) Suppose we have a random variable X, and define a new random variable Y, which is equal to X if X 8 and equal to 0 otherwise. Assume X takes on only a finite number of values (just a mathematical nicety, not really an issue). Which one of the following is true: [(i)] . [(ii)] . [(iii)] Either of and could be larger than the other, depending on the situation. [(iv)] is undefined. 3. This problem concerns the binary tree model in our homework. [(a)] (25) Find the probability that the root has exactly 1 grandchild, expressing your answer in terms of p, algebraically simplified. [(b)] (25) Fill in the blanks in the following code simulating the function r(k,p): sim1tree <- function(k,p) if (k == 0) return( ) # blank prevlevelbranches <- 1 for (m in 1: ) # blank newbranches <- 0 for (i in 1: ) # blank for (j in 1:2) if (sample(0:1,1,prob=c(1-p,p)) == 1) newbranches <- newbranches + 1 if (newbranches == 0) return(0) # blank return(1) treesim <- function(p,k,nreps) count <- 0 for (i in 1:nreps) treetok <- sim1tree(k,p) count <- count + treetok return( ) # blank Solutions: 1. X can take on the values 1 through 9. Then , , etc. 2. Answer (iii) is correct. If we'd had the additional condition , then (i) would have been right. But without that condition, then for instance suppose X were always negative; then Y would always be larger, etc. 3.a The root will have exactly one grandchild iff it has two children, and one of them has one child and the other has none. 
Thus the queried probability is 3.b sim1tree <- function(p,k) if (k == 0) return(1) prevlevelbranches <- 1 for (m in 1:k) # levels newbranches <- 0 for (i in 1:prevlevelbranches) for (j in 1:2) # account for left, right outlinks if (sample(0:1,1,prob=c(1-p,p)) == 1) newbranches <- newbranches + 1 if (newbranches == 0) return(0) prevlevelbranches <- newbranches return(1) treesim <- function(p,k,nreps) count <- 0 for (i in 1:nreps) treetok <- sim1tree(p,k) count <- count + treetok return(count/nreps) ","course":"ECS132"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (20) In (2.30) and (2.32), cite ``mailing tubes'' for each, in the form of equation numbers that were used. 2. (40) This problem concerns the ALOHA network example. Let denote the event that Node A attempts to transmit during epoch i, i = 1,2,... and define similarly for Node B. In each case below, you are given two events, in the first two columns of the table. Write in the third column either I, for independent, D, for disjoint, or N, for neither. rrr & & & & & & and & and not & 3. (40) In the simple board game on pp.15ff, let denote your position after your ith turn, i = 1,2,... Find and , giving your answers as fractional expressions, e.g. (1+2/3) / (2 + 1/2). Solutions: 1. (2.2), (2.5) 2. rrr & & I & & N & & N and & and not & D 3. Let denote your ith ordinary roll, with being the bonus you get for roll i. * Get by reasoning it out. After our first turn, if we are at square 2, the only way to be at that square after the next turn is to first roll a 1, getting us to the bonus square 3, and then roll a 6 for our bonus. The probability is thus (1/6) (1/6). ","course":"ECS132"} {"quiz":" Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. 
There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. Consider the ALOHA simulation on p.47. [(a)] (20) On what line do we simulate the possible creation of a new message? [(b)] (20) Change line 10 so that it uses sample() instead of runif(). 2. (20) Say we roll two dice, a blue one and a yellow one. Let B and Y denote the number of dots we get, respectively. Now let G denote the indicator random variable for the event S = 2, where S = B+Y. Find E(G). 3. Suppose are independent indicator random variables, with , j = 1,2,3. Find the following in terms of the , writing your derivation in ``stacked equation'' form [as for example in (3.53)-(3.55)], with reasons in the form of mailing tube numbers. You should do reasonable algebraic simplification of your expressions. Let . [(a)] (20) ES [(b)] (20) Var(S) Solutions: 1.a 14 1.b [fontsize=-2] numsend <- numsend + sample(0:1,1,prob=c(p,1-p)) 2. EG = P(G = 1) = P(B+Y = 2) = 1/36 3.a 3.b Let denote the event associated with , j = 1,2,3, and let denote the event that and both occur. Then is the indicator random variable for . Thus ","course":"ECS132"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. (20) Consider the ALOHA example, Sec. 3.14.3. Write a call to the built-in R function dbinom() to evaluate (3.123) for general m and p. 2. Consider the bus ridership example, Sec. 2.10. Suppose upon arrival to a certain stop, there are 2 passengers. Let A denote the number of them who choose to alight at that stop. [(a)] (10) State the parametric family that the distribution of A belongs to. [(b)] (20) Find and , writing each answer in decimal expression form e.g. . 3. (20) Consider the following simple inventory model. 
A store has 1 or 2 customers for a certain item each day, with probabilities p and q (p+q = 1). Each customer is allowed to buy only 1 item. When the stock on hand reaches 0 on a day, it is replenished to r items immediately after the store closes that day. If at the start of a day the stock is only 1 item and 2 customers wish to buy the item, only one customer will complete the purchase, and the other customer will leave emptyhanded. Let be the stock on hand at the end of day n (after replenishment, if any). Then form a Markov chain, with state space 1,2,...,r. Write a function inventory(p,q,r) that returns the vector for this Markov chain. It will call findpi1(), similarly to the two code snippets on p.65. Solutions: 1. [numbers=left] dbinom(1,m,p) 2.a binomial 2.b 3. inventory <- function(p,q,r) tm <- matrix(rep(0,r^2),nrow=r) for (i in 3:r) tm[i,i-1] <- p tm[i,i-2] <- q tm[2,1] <- p tm[2,r] <- q tm[1,r] <- 1 return(findpi1(tm)) ","course":"ECS132"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. Consider the 2t/15 example, Sec. 4.3.3. Suppose this is the density of light bulb lifetimes L (on the time scale of years). Note: In all parts below, give each answer as decimal expression, e.g. , or as a common fraction reduced to lowest terms. You may cite equations in that section. [(a)] (20) Find the proportion of bulbs with lifetime less than the mean lifetime. [(b)] (20) Find E(1/L). [(c)] (20) If I test many bulbs, on average how long will it take to find two that have lifetimes longer than 2.5? [(d)] (20) Suppose I've been using bulb A for 2.5 years now in a certain lamp, and am continuing to use it. But at this time I put a new bulb, B, in a second lamp. I am curious as to which bulb is more likely to burn out within the next six months. Find the two probabilities. 2. 
(20) The expected value of a chi-square random variable with k degrees of freedom turns out to be k. Derive this fact in a step-by-step manner, citing mailing tubes, and NOT using material past p.94. Solutions: 1.a 1.b 1.c Use (3.110) with k = 2 and p = 0.65. 1.d First find the probability of NOT burning out in the next six months. For bulb A, use (6.3), yielding 2. Let Y be as in (4.55). Then ","course":"ECS132"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. Consider once again the Jack and Jill example, Sec. 3.14.1. Write your answers as a decimal expression, e.g. . [(a)] (20) Find . [(b)] (15) Find . 2. Consider the Ethernet example, Sec. 5.5.4. Answer the following with either decimal expressions (see above) or integrals. The latter must be definite integrals that calculus students could evaluate to actual numbers. [(a)] (10) Find Var(X). [(b)] (10) Find . [(c)] (15) Find . [(d)] (15) (BONUS PROBLEM: Points plus Extra Credit) Suppose transmission time T for a message is also random, exponentially distributed with mean 0.1. Find . 3. (15) Fill in the following R code to find in Sec. 4.4.7.2: xpois(blank2, blank3) The \"x\" here is blank1. Solutions: 1.a 1.b 2.a Since , then (p.95). 2.b . 2.c 0 2.d Here T will be the transmission time for X. We need to find . Following the computation in Sec. 5.5.6, we have So, similar to the reasoning in (5.87), we have 3. Use Sec. 4.4.6.1. ppois(4,5.52) ","course":"ECS132"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. 1. 
(20) Fill in the blank with a term from our course: The value on the left-hand side of (6.1) turns out not to depend on t, in the case of W being exponentially distributed. We say that the W has a constant function. 2. Suppose , and are independent, each with distribution U(0,1). [(a)] (20) Write (but do not evaluate) an integral for . In parts (b) and (c), suppose we're interested in finding , using (5.107). [(b)] (20) Show the matrix A. [(c)] (20) Show the matrix Cov(W). 3. (20) Suppose Pei and Gowtham each take random samples of size 2 with replacement from the three-person population in the toy example on p.185. Find the probability that Gowtham's sample mean is exactly equal to Pei's. EXPRESS YOUR ANSWER AS A SINGLE FRACTION, REDUCED TO LOWEST TERMS, but show your work! Solutions: 1. hazard 2.a 2.b 2.c A U(0,1) distribution has variance 1/12 and the are independent. So the covariance matrix is diagonal, with all diagonal elements equal to 1/12. 3. ","course":"ECS132"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), pnorm(), etc. Note too the R function integrate(), e.g. > integrate(function(x) x^2,0,1) 0.3333333 with absolute error < 3.7e-15 The limits of integration must be numbers or Inf or -Inf, not symbols. Thus one cannot use it for the inner integral in a double integral. 1. Consider the computer worm example in Sec. 8.3. Say g = 5. [(a)] (25) Find the probability that the time spent at state 2 (before going to state 3) is greater than 0.10. [(b)] (15) Let denote the time needed to go from state 3 to state 5. Find . 2. In each of the following, state---using mathematical symbols, e.g. 
E(), P(), F, f, , names of the variables used in the example (don't make up your own names), etc.---what the integral is calculating. Use an underscore for subscripts in your Answers file, e.g. write U_1 for . [(a)] (0) in Section 8.3.4. [(b)] (25) in Section 8.2.4. [(c)] (20) , where we have independent random variables X and Y with density on . 3. (15) Suppose X and Y are independent and each have a uniform distribution on the interval (0,1). Let Z = min(X,Y). Find . Solutions: 1.a The text says that the time spent at state 2 is exponentially distributed with parameter 2x3 = 6. So, the probability is 1-pexp(0.10,6) 1.b Have sum of two independent exponentials, but with different values---just like the backup battery example. So, is equal to exp(-0.3/6) + exp(-0.3/4 (where and ). 2.a (error in original problem statement) 2.b E(XY) 2.c 3. Using the same derivation as on p.168, So, and . 4. This is the density of X+Y, where X and Y are independent with the given density. ","course":"ECS132"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed); do not turn in any supplementary sheets of paper. There is actually plenty of room for your answers, as long as you organize yourself BEFORE starting writing. AT THE END OF THE EXAM: E-mail me your code, in a single file named ID1.ID2...R, where the ID values are the student ID numbers of the members of your group (only those present, of course). Suppose we sample q people at random from a population consisting of m individuals, numbered 1,...,m. There are three subpopulations: Those numbered 1,...,c; those numbered c+1,...,c+d, and those numbered c+d+1,...,m. Let X, Y and Z denote the numbers of people who fall into the three subpopulations. [(a)] () Suppose the sampling is with replacement. Find the exact value of . Express your answer as an R function, rplcsamp(m,q,c,d,i,j,k). [(b)] () Same as (a), except that sampling is without replacement. Your R function will be norplcsamp(m,q,c,d,i,j,k). 
[(c)] () Same as (b), except that the probability is found via simulation. The call is simnorplcsamp(m,q,c,d,i,j,k,nreps), with a default value of 10000 for nreps. Do NOT include any error-checking code. ","course":"ECS132"} {"quiz":"Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (20) The formal term used when two events cannot occur ``in the same notebook line'' is that they are . 2. Consider the simple board game in Section 2.10. Let X denote your position after your first turn. [(a)] (20) Find P(X = 1). [(b)] (20) Find P(R = 1 X = 1). 3. Consider the ALOHA example, same as in the book, except that both nodes start out inactive, i.e. . Assume p = 0.6, q = 0.2. [(a)] (20) Find the probability that there is a collision in first epoch. [(b)] (20) Find . Solutions: 1. disjoint 2.a 2.b 3.a How can it happen? A collision will occur in the first epoch if and only both nodes develop messages and both try to send, which has probability 3.b How can it happen? will occur if and only if both nodes go active and either both send or both refrain from sending. the probability of that is ","course":"ECS132"} {"quiz":"amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), sum(), etc. 1. Suppose X has the density on , 0 elsewhere. Note: This problem is numerical (as are most of our Quiz problems), so it requires R expressions as answers. That R expression must evaluate to a number. You may use integrate() if you know how, but it's easier just to do the integration yourself. [(a)] (15) Find . [(b)] (20) Find EX. 2. Consider the Markov model of bus ridership, pp.87ff. [(a)] (15) Find . 
[(b)] (10) Suppose we wish to find the long-run average number of passengers that alight from the bus, per stop. This will be Give the value of . 3. (20) Suppose in modeling disk performance, we describe the position X of the read/write head as a number between 0 and 1, representing the innermost and outermost tracks, respectively. Say we assume X has a uniform distribution on (0,1). Consider two consecutive positions (i.e. due to two consecutive seeks), and , which we'll assume are independent. Find . 4. (20) Consider the network intrusion model, pp.104-105. Assume there is never an intrusion, i.e. all logins are from Jill herself. Say we've set our network intrusion monitor to notify us every time Jill logs in and accesses 535 or more disk sectors. In what proportion of all such notifications will Jill have accessed at least 545 sectors? Solutions: 1.a 1.b 2.a 2.b 0.2 3. 1/12 + 1/2 4. This is . By an analysis similar to that in Section 5.5.2.3, this probability is (1 - pnorm(545,500,15)) / (1 - pnorm(535,500,15)) ","course":"ECS132"} {"quiz":" Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), sum(), integrate() etc. 1. (15) Consider the marble example, Section 11.5. Find Var(Y B = 2). 2. (15) Suppose in Equation (8.22) I wish to form an 88% confidence interval, instead of a 95% one. Give an expression, which must involve a call to one of the R functions we've used, to calculate the number I'll use instead of 1.96. 3. Consider the ``new, improved light bulbs'' example in Section 9.6.2. Note: Each of the parts here is independent of the others. [(a)] (10) If we wished to have significance level , sampling 50 bulbs, what should be our threshold for rejection, like the w in the example? 
[(b)] (15) Suppose we have 15 people test batches of 10 light bulbs, each performing a significance test as in the example. Suppose also that actually is true. Find the probability that at least 3 of the people reject . [(c)] (15) Suppose it turns out that . Find the p-value. 4. In the baseball data, Section 11.7, I wanted to run separate regression analyses for catchers and starting pitchers. [(a)] (15) I extracted the two subsets of my original data frame players, naming them catch and pitch. Give one line of R code that creates catch. [(b)] (15) I ran regressions of weight on height in the two groups, with these results: [fontsize=-2] > summary(lm(catchHeight)) Call: lm(formula = catchHeight) Residuals: Min 1Q Median 3Q Max -31.505 -7.603 -1.603 8.495 31.789 Coefficients: Estimate Std. Error t value Pr(>t) (Intercept) -79.4301 67.9087 -1.17 0.246 catch Weight pitch Weight pitch Height 4.4407 0.5943 7.472 1.89e-12 *** --- Find an approximate 95 confidence interval for the difference (catchers minus pitchers) between the slopes for the Height variables for the two groups. Solutions: 1. 2. -qnorm(0.06) 3.a qgamma(0.90,50,0.001) / 50 3.b 1 - pbinom(2,15,0.05) 3.c 1 - pgamma(16242,10,0.001) 4.a catch <- subset(players,Position == \"Catcher\") or catch <- players[players ","course":"ECS132"} {"quiz":" xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Do NOT turn in this sheet of paper (unless you lack a laptop or have a laptop failure during the Exam). You will submit electronic files to handin. INSTRUCTIONS FOR SUBMISSION: Submit to the CSIF handin, under my account, using the alphabetically earliest UCD e-mail address among your group members. Submit ONLY the files Problem1.tex and Problem2.R. 1. (50) Suppose on , 0 elsewhere. Find for the case . Tip: Find first. Submit your derivation in a LaTeX file Problem1.tex. My grading script will check it by running 2. 
(50) Lifetimes of some electronic component formerly had an exponential distribution with mean 100.0. However, it's claimed that now the mean has increased. (Suppose we are somehow sure it has not decreased.) Someone has tested 50 of these new components, and has recorded their lifetimes, . Unfortunately, they only reported to us the range of the data, . We will need to do a significance test with this limited data, at the 0.05 level. Note (p.222) that it will necessarily be a bit different from 0.05. Take the one that is nearest but no larger than 0.05. You may wish to use the R ceiling() function here. Use simulation (because the problem is too difficult mathematically) to find a cutoff value v for our significance test, and state whether we reject if or . Submit your full code in a file Problem2.R. My grading script will check it by running > source(\"Problem2.R\") and your code will print out something like ``reject if R 202.8.'' Solutions: 1. Since X and Y are not independent, we cannot use convolution. So . 2. [numbers=left] # random sample of size 50, test H0: mu = 100.0, # against HA: mu > 100.0, exponential distribution; # just have range R # code to determine the cutoff point for significance # at 0.05 level nreps <- 200000 n <- 50 rvec <- vector(length=nreps) for (i in 1:nreps) x <- rexp(n,0.01) rng <- range(x) rvec[i] <- rng[2] - rng[1] rvec <- sort(rvec) cutoff <- rvec[ceiling(0.95*nreps)] cat(\"reject H0 if R >\",cutoff,\"\") # check (not requested): tvec <- vector(length=nreps) for (i in 1:nreps) x <- rexp(n,0.01) rng <- range(x) rej <- (rng[2] - rng[1]) > cutoff tvec[i] <- rej print(mean(tvec)) # should be near 0.05 ","course":"ECS132"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 
Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), sum(), etc. 1. Consider the train rendezvous problem in Section 8.2.4. In each of the following, give your answer as a double integral, dt ds. In your electronic answers file, give your answer as five R expressions separated by commas (and optional spaces), as follows: lower limit in outer integral; upper limit in outer integral; lower limit in inner integral; upper limit in inner integral; integrand. For instance, for write 3, 8, s, 1-s, (s+t)^2 [(a)] (15) Find . [(b)] (20) Find . 2. Suppose the random vector has mean vector (1.5,2.0)'. Suppose also that each has variance 4 and that . [(a)] (15) Find , as an R matrix expression. [(b)] (15) Find , as an R matrix expression. 3. The function below computes the correlation matrix corresponding to a given covariance matrix. Element (i,j) of the former is the correlation between the ith and the jth elements of the given random vector. covtocorr <- function(covmat) n <- nrow(covmat) stddev <- vector(length=n) cormat <- matrix(nrow=n,ncol=n) for (i in 1:n) stddev[i] <- blank (a) cormat[i,i] <- blank (b) for (i in 1:(n-1)) for (j in (i+1):n) tmp <- blank (c) cormat[i,j] <- tmp cormat[j,i] <- tmp return(cormat) [(a)] (5) Fill in blank (a). [(b)] (5) Fill in blank (b). [(c)] (15) Fill in blank (c). 4. (15) Suppose we have three electronic parts, with independent lifetimes that are exponentially distributed with mean 2.5. They are installed simultaneously. Find the mean time until the last failure occurs. Solutions: 1.a 0.8, 1, 1.8-s, 1, 2-s-t 1.b 0, 0.4, 0, 0.3, 2-s-t 2.a c(2,3) 2.b c(1,1) 3. covtocorr <- function(covmat) n <- nrow(covmat) stddev <- vector(length=n) cormat <- matrix(nrow=n,ncol=n) for (i in 1:n) stddev[i] <- sqrt(covmat[i,i]) cormat[i,i] <- 1.0 for (i in 1:(n-1)) for (j in (i+1):n) tmp <- covmat[i,j] / (stddev[i]*stddev[j]) cormat[i,j] <- tmp cormat[j,i] <- tmp return(cormat) 4. 
As in the computer worm example in Section 8.3.8, the mean is 1/(3*0.4) + 1/(2*0.4) + 1/(1*0.4) ","course":"ECS132"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), sum(), etc. 1. Consider the simulation on p.62. [(a)] (15) Suppose line 7 were replaced by mean(dvals^2) - (mean(dvals))^2 (Recall that the return() is not necessary.) What would replace ``ED'' in the comment in line 6? Your answer should reflect the likely goal of the programmer, and you must use official notation from our course (which is fairly standard in probability books); ED is an example, as is P(D = 3) and so on. [(b)] (10) Suppose line 7 were replaced instead by mean(dvals > 18) What would replace ``ED'' in line 6 in this case? 2. This problem involves the parking space example, Sec. 3.12.3.2. [(a)] (15) Find P(D = 12). [(b)] (10) Find Var(N). [(c)] (10) Good news! I found a parking place just one space away from the destination. Find the probability that I am parked in the same block as the destination. 3. (10) We have two vectors, x and y. The elements of the latter are either \"a\", \"b\" or \"c\". We want to create a new vector with the following property: For any element in y that has the value \"a\", the new vector's corresponding element will be 100, with the new value being 200 in the case of \"b\". In the case of \"c\", the element in the new vector will be the corresponding element of x. Write a single line of code that creates and prints out this new vector. You are not allowed to use loops. 
Example: > x [1] 5 12 13 3 4 5 > y [1] \"a\" \"c\" \"b\" \"b\" \"a\" \"a\" > # single line of code here, > # maybe with semicolons [1] 100 12 200 200 100 100 (My grading script will set global variables x and y. Your code should NOT do this.) 4. (30) The function rpmf() below generates n random values from a distribution with probability mass function pmf and support supp. The term support for a distribution is just a fancy name for the set of values a random variable with that distribution can take on. The function is then used to find the approximate probability that in 3 consecutive stops in the bus ridership example (Sec. 2.11), a total of 2 passengers board. Fill in the blanks, without using loops. rpmf <- function(n,pmf,supp) blank (a) bvals <- rpmf(3000,c(0.5,0.4,0.1),0:2) m <- matrix( blank (b) ) # the following call sets sums[i] # to the sum of row i of m sums <- apply(m,1,sum) blank (c) Solutions: 1.a Var(D) 1.b P(D > 18) 2.a 2.b 2.c 3. ifelse(y == \"b\",200,ifelse(y == \"a\",100,x)) 4. rpmf <- function(n,pmf,supp) sample(supp,n,prob=pmf,replace=T) bvals <- rpmf(3000,c(0.5,0.4,0.1),0:2) m <- matrix(bvals,ncol=3) # the following call sets sums[i] # to the sum of row i of m sums <- apply(m,1,sum) mean(sums == 2) ","course":"ECS132"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), sum(), etc. 1. The following R function forms a confidence interval for a population proportion, based on the sample in the vector x, of approximate confidence level conflevel. 
pci <- function(x,conflevel) n <- __________(x) # blank (a) p_hat <- __________(x) # blank (b) stderr <- __________ # blank (c) multiplier <- _________ # blank (d) # in R, last value computed is returned c(__________) # blank (e) For instance, if our sample data is 1, 0, 0, and 1, and we want an approximate 95% confidence level, the call would be pci(c(1,0,0,1), 0.95) [(a)] (10) Fill in blank (a). [(b)] (10) Fill in blank (b). [(c)] (10) Fill in blank (c). [(d)] (10) Fill in blank (d). [(e)] (10) Fill in blank (e). 2. There is the concept of the power of a hypothesis test, defined to be the probability of rejecting in a circumstance in which is false. For instance, consider the coin example in pp.221ff. The power of the test at 0.55 is defined to be the probability that we reject if the true value of p is 0.55. [(a)] (15) Power plays a big role in theoretical statistics, where theorems are proved for things such as Uniformly Most Powerful tests. But in practice, we may wish the power to be low, not high, in some settings. Fill in the blank: In the coin problem, for example, we may wish to have low power in the setting in which p . [(b)] (10) Consider the light bulb lifetime problem in pp.226-227. Find the power of the test for the case . 3. (15) Suppose we have two population values to estimate, and , and that we are also interested in the quantity . We'll estimate the latter with . Suppose the standard errors of and turn out to be 3.2 and 8.8, respectively. Find the standard error of . 4. (10) A news report tells us that in a poll, 54% of those polled supported Candidate A, with a 2.2% margin of error. Assuming that a 95% level of confidence was used, find the approximate number polled. 
Solutions: 1.a-e pci <- function(x,conflevel) n <- length(x) p_hat <- mean(x) stderr <- sqrt(p_hat * (1-p_hat) / n) multiplier <- qnorm(0.5 + conflevel / 2) # in R, last value computed is returned c(p_hat - multiplier*stderr, p_hat + multiplier*stderr) 2.a is near 0.5 2.b 1 - pgamma(15705.22,10,1/1250) 3. 4. 0.54 * 0.46 / (0.022/1.96)^2 ","course":"ECS132"} {"quiz":" Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (20) Fill in the blanks: One can often get good speed with R code by using vectorization. The reason R is slow without using such devices is that it is a(n) blank (a) language. The basic programming construct that is of particular concern in terms of slowness is blank (b). (Note: Examples of programming constructs are functions, loops, if-else, arrays, strings etc., NOT types of programming, e.g., simulation.) 2. (10) Consider Equation (2.60), p.21, numerator. Which ``mailing tube'' justifies the fact that 0.4 multiplies something? Cite a specific equation number among mailing tubes. 3. (20) Consider the ALOHA network model. Say we have two such networks, A and B. In A, the network typically is used for keyboard input, such as a user typing e-mail or editing a file. But in B, users tend to do a lot of file uploading, not just typing. Fill in the blanks: In B, the model parameter blank (a) is blank (b) than in A, and in order to accommodate this, we should set the parameter blank (c) to be relatively blank (d) in B. 4. Consider the ALOHA network model with , and . We are interested in . [(a)] (10) Find that probability. (You should probably make use of existing computations, to save time.) [(b)] (10) Suppose we were to actually do the ``notebook'' process, shown in Table 2.3, p.14. We observe the network for many 2-epoch stints, 10000 of them, yielding 10000 lines in our notebook. 
And say we store our notebook data in an R data frame named notebook, with 10000 lines and 4 columns. (Of course, the column labeled ``notebook line'' is not stored.) We enter 1s and 0s instead of Yes's and No's. State an R expression which would give us the approximate value of . 5. (15) A famous graph model is Preferential Attachment. Think of it as a social network, with each edge representing a ``friend'' relation. (The graph is undirected, i.e. friendships are mutual.) The number of vertices grows over time, one vertex per time step. At time 0, we have just two vertices, and , with a link between them. In any graph, the degree of a vertex is its number of edges. Thus at time 0, each of the two vertices has degree 1. Whenever a new vertex is added to the graph, it randomly chooses an existing vertex to attach to, creating a new edge with that existing vertex. In making that random choice, it follows probabilities in proportion to the degrees of the existing vertices. For instance (just an example!), suppose that at time 2, when is added, the adjacency matrix for the graph is Then there will be an edge created between with , or , with probability 2/4, 1/4 and 1/4, respectively. Find . 6. (15) Consider the simulation of the bus ridership model, p.26. Give a single R statement that replaces lines 8-10, by calling sample(). Solutions: 1. interpreted; for loops 2. (2.5) 3. q; larger; p; low 4a. One way is to use (2.3). From quantities already calculated in the text, this is 4b. mean(notebook[,1] + notebook[,2] > 0) 5. Let denote the node that attaches to, i = 3,4,... Then 6. passengers <- passengers - sum(sample(0:1,passengers,replace=T,prob=c(0.8,0.2))) ","course":"ECS132"} {"quiz":" Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 
Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), sum(), etc. 1. Consider the OOP study described at the top of p.281, which was actually a bit different from the description in our book: They also used logarithms, but we'll ignore that here. The results were: rrrr coef. & betahat & std.err. & 4.37 & 0.23 & 0.49 & 0.07 & 0.56 & 1.57 & -0.13 & -1.34 [(a)] (10) The last term in () is known as the term. [(b)] (20) Find the estimated difference in mean completion time under OOP and using a procedural language (former minus the latter), for 1000-line programs. [(c)] (15) Find an approximate 95% confidence interval for , answering with R's c() form. [(d)] (15) Find . 2. (15) In the marbles example, p.147, find . 3. The code below estimates the regression function for scalar X, without assuming a linear or other parametric model. The vector parameters y, x, and the scalar parameter t, are self-explanatory. As to the scalar parameter h, I'll simply say that we consider one number u ``near'' another number v if . nonparregest <- function(y,x,t,h) dists <- blank (a) xnear <- blank (b) blank (c) [(5)] Fill blank (a). [(10)] Fill blank (b). [(10)] Fill blank (c). Solutions: 1.a interaction 1.b (4.37 + 0.49*1000 + 0.56*1 - 0.13*1000*1) - (4.37 + 0.49*1000 + 0.56*0 - 0.13*1000*0) 1.c c(0.49 - 1.96 * 0.07, 0.49 + 1.96 * 0.07) 1.d 0.23 ^ 2 2. (0.036*0 + 0.048*1 + 0.006*2) / (0.036 + 0.048 + 0.006) 3. nonparregest <- function(y,x,t,h) dists <- abs(x-t) xnear <- which(dists < h) mean(y[xnear]) ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. Consider Equation (3.64). [(a)] (15) List (on one line), the equation number(s) of the mailing tubes used to justify the equality . [(b)] (15) Give the equation number of the relation that justifies . 2. 
(15) Give the number of the mailing tube that justifies (3.80). 3. Consider the variables , p.56. [(a)] (10) Find . [(b)] (15) Find . 4. (15) Suppose and are independent random variables, with , , and . Find . 5. (15) In a certain game, Person A spins a spinner and wins dollars, with mean 10 and variance 5. Person B flips a coin. If it comes up heads, Person A must give B whatever A won, but if it comes up tails, B wins nothing. Let denote the amount B wins. Find . Solutions: 1.a (3.47), (3.40) 2. (3.32) 3.a Given the first draw resulted in a man, there will be 5 men and 3 women left, so the probability is 5/8. 3.b The requested probability is that of getting two men or two women, (6/9) (5/8) + (3/9)(3/8) 4. Use the relations (for independent U,V) and then use repeatedly: 5. Use (), in this case with X = I, where I is an indicator variable for the event that B gets a head, and Y = S. Then T = IS, and I and S are independent, so Then use the facts that I has mean 0.5 and variance 0.5(1-0.5), with S having the mean and variance given in the problem. ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: Work only on this sheet (on both sides, if needed). MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. Important note: Remember that in problems calling for R code, you are allowed to use any built-in R function, e.g. choose(), sum(), integrate()etc. 1. Consider the good ol' bus ridership examples. Except when referring to the examples in which there is a limit on the number of passengers who can fit into the bus, assume no limit. [(a)] (15) Find the probability that in 10 consecutive stops, it turns out that at exactly 3 of them there are no new passengers boarding. [(b)] (10) Find Var(T) in (3.134). (Helpful hint: and have the same distribution, thus the same variance.) [(c)] (10) In Sec. 4.5 (max 20 riders), find . [(d)] (15) Consider (4.2). 
The variable t there corresponds to what variable in the code in Sec. 2.12.4? (Assume the code has been modified to reflect a 20-rider limit.) [(e)] (10) In Sec. 4.5 (max 20 riders), suppose we code the transition matrix in the R matrix p. Find . Your answer must be a valid R expression that involves p; no loops. [(f)] (10) In Sec. 4.5, suppose the bus is tiny, with a capacity of only 3 passengers. Find the long-run average number of passengers who alight from the bus. Write your answer as a valid R expression in the vector, which we will assume is named pivec. Remember, pivec[1] is , etc. 2. (10) Find Var(L) in (3.118). 3. (10) Suppose X has the density on , 0 elsewhere. Find EX. You'll probably want to use the exp() function in R. Solutions: 1.a dbinom(3,10,0.5) 1.b 1.c 1.d nstops 1.e (p 1.f Mean of binomial is np. pivec[2] * (1 * 0.2) + pivec[3] * (2 * 0.2) + pivec[4] * (3 * 0.2) 2. From (3.117), 3. integrate(function(t) t^2 * exp(-t) ,0,Inf) ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. This problem concerns the parking space model, pp.66ff. Let denote the number of open spaces in the i-th block. [(a)] (10) Find . [(b)] (10) Give a loop-free R expression for , using one or more of the functions on p.66. [(c)] (10) Give a loop-free R expression for . [(d)] (15) Give a loop-free R expression for . [(e)] (10) Give a loop-free R expression for . [(f)] (15) Give a loop-free R statement to place between lines 5 and 6 in the code on p.67 that will print the approximate value of . 2. (15) Give a single, loop-free R statement to replace lines 9-10 in the ALOHA network model on p.57, making use of one of the functions introduced in Chapter 3. Think of the notebook! 3. (15) In the bus ridership model, first introduced in Sec. 2.11, find . Solutions: 1a. 1b. pgeom(5,0.15) 1c. dbinom(3,10,0.5) 1d. 1e. 
Same as (d); see definition of Var(). 1f. mean(1/(1+dvals^2)) 2. rbinom(1,2,p) 3. dbinom(4,8,0.2) ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. Consider the coin/die game, p.83. [(a)] (15) Find . [(b)] (15) Find . 2. Consider the bus ridership example once again, in this case in Sec. 3.16. [(a)] (20) Find . [(b)] (15) Find . [(c)] (15) Find . (You may find that some of the computation has already been done for you in the text.) 3. (20) Below is a revised version of the bus ridership simulation on p.26. It computes the same quantity, but in a somewhat more efficient manner. Fill in the blanks. bussim <- function(nstops,nreps) b <- sample(0:2, ________ , // blank (a) replace=TRUE,prob=c(0.5,0.4,0.1)) b <- matrix(b,nrow=nreps) passeq0 <- vector(length=nreps) for (i in 1:nreps ) passengers <- 0 for (j in 1:nstops) if (passengers > 0) passengers <- passengers - ________ // blank (b) passengers <- passengers + ________ // blank (c) passeq0[i] <- ________ // blank (d) mean(passeq0) Solutions: 1a. M has a geometric distribution with p = 1/6, so from our section on that distribution. 1b. As noted in the example, given M = k, W has a binomial distribution with k trials and success probability 0.5. That distribution has variance , from our text section on that distribution. 2a. Ask the famous question, ``How can it happen?'' The only way is , which has probability . 2b. We are being asked for . Again, ``How can it happen?'' Here we must have , which has probability . 2c. Then from p.59, 3. 
bussim <- function(nstops,nreps) b <- sample(0:2,nreps*nstops, replace=TRUE,prob=c(0.5,0.4,0.1)) b <- matrix(b,nrow=nreps) passeq0 <- vector(length=nreps) for (i in 1:nreps ) passengers <- 0 for (j in 1:nstops) if (passengers > 0) passengers <- passengers - rbinom(1,passengers,0.2) passengers <- passengers + b[i,j] passeq0[i] <- passengers == 0 mean(passeq0) ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. Suppose for , with the value 0 elsewhere. [(a)] (15) Find . [(b)] (15) Find . 2. Consider the analysis of the ALOHA network, Section 4.5. (Keep in mind that the state numbers start at 0 but the row/column numbers in the R code start at 1.) [(a)] (20) In the R code on p.101, give a line (say, to be placed at the end) that will print out . [(b)] (20) Suppose there are three nodes in the network, rather than two, and that and are 0.4 and 0.3, respectively. Our states are now 0, 1, 2 and 3. Find 3. Consider bus ridership example, Section 4.6. [(a)] (15) Now that we have imposed a bus capacity of 20, we need to modify the simulation on p.20. State a line to add to the code to reflect this. Place the line position in a comment, e.g. count <- count/2 # insert after line 6 Do NOT change existing lines. [(b)] (15) Find . Solutions: 1a. integrate(function(t) 1/(2*t^2),1.5,2) 1b. integrate(function(t) 1/(2*t),1,2) 2a. (transmat 2b. The quantity is the probability that exactly 1 node tries to send, so the answer is dbinom(1,3,0.4) 3a. passengers <- min(passengers,20) # insert after line 12 3b. 
The possibilities are at least 1 person tries to board and no one alights 2 people try to board and 1 alights So the desired probability is (0.4 + 0.1) * dbinom(0,19,0.2) + 0.1 * dbinom(1,19,0.2) ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (10) In the phrasing of the article on Microsoft's new Azure software, the firm aims the app to machine learning. 2. (10) A trucking company transports many things, including furniture. Let be the proportion of a truckload that consists of furniture. For instance, if 15% of a given truckload is furniture, then X = 0.15. We have data on , and will plot histograms and so on, in order to find a good model. Suggest a good parametric distribution family for modeling . 3. This problem concerns the material in Section 5.5.5. The random variables and are as in the bottom of p.122 and the top of p.124, respectively. In the case of , say the mean lifetime of light bulbs is 120 hours. If you need the function, R offers it as gamma(). Also, there are factorial() and exp(). [(a)] (10) State the value of . [(b)] (10) Find . [(c)] (15) Find . [(d)] (10) Concerning , we've asked someone to notify us when the eighth bulb burns out. At time 102.2 hours after the first bulb is installed, we still haven't heard from our notifier. Find the probability that at time 222.1, we still have not been notified. [(e)] (15) The text remarks that in Figure 5.2, the curve for r = 10.0 is already looking rather bell-shaped. By calling pnorm(), we can find the normal approximation to, say, the cdf corresponding to that density, evaluated at 14.2. What would be our third argument in that function? Say that figure used . 4. (10) Suppose for , 0 otherwise. Find the median of , i.e. the 0.5 quantile. 5. (10) Suppose and are independent, each having distribution N(0,1). Find ). Solutions: 1. democratize 2. 
beta distributions 3a. The gamma family has variance , so 3b. dgamma(88,5,0.01) 3c. Use Property E: integrate(function(t) (1/t) * 1/(factorial(4)) * 0.01^5 * t^4 * exp(-0.01*t),0,Inf) 3d. Then use (1 - pgamma(222.1,5,0.01)) / (1 - pgamma(102.2,5,0.01)) 3e. So, we would use the standard deviation, . 4. We have , so set , yielding t = 2. 5. has a chi-square distribution with k = 1 degree of freedom. So, its variance is . The original expression has value 4. ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (10) On p.144, in an application of the Central Limit Theorem, it states ``The exact answer [to 3 decimal places] is 0.132.'' Write an R expression that would give us that exact answer. 2. (10) Consider the network intrusion example, Section 6.2. (Suppose it really is Jill, no intrusion.) Find . 3. This problem concerns the toy population example, Section 10.1.2. [(a)] (10) Find . [(b)] (15) Find . [(c)] (15) Find . [(d)] (15) In this part only, suppose we draw a simple random sample of size 2. Find . 4. Suppose on (0,c), 0 elsewhere. Call this the ``t2 family.'' [(a)] (10) Write (on one line) pt2(x,c), for scalar x. [(b)] (15) Write (on one line) rt2(n,c), to generate n random variates. Solutions: 1. 1 - pbinom(12,20,0.5) 2. pnorm(535,500,15) - pnorm(510,500,15) 3a. 3b. 3c. 3d. 4a. pt2 <- function(x,c) (x/c)^3 4b. rt2 <- function(n,c) c * runif(n)^(1/3) ","course":"ECS132"} {"quiz":"hyperref listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= GROUP QUIZ SUBMISSION INSTRUCTIONS: Your work must be submitted by 12 noon, March 12. Submission must be done from within the classroom. Submit your work on handin, to the directory 132quiz8. Your .tar file name must conform to the rules explained in our Syllabus, Section 19.4. 
Your .tar file must comprise three files, named Problem1a.R, FindEta.R and Problem2.R, with contents as specified below. My grading script will be source(\"Problem1a.R\") whicheqn() # set p (not shown) source(\"FindEta.R\") findeta(p) source(\"Problem2.R\") # set c, n, nreps (not shown) cmp2ests(c,n,nreps) You are welcome to search the Web, though my saying this should not be construed to mean you necessarily would benefit from this. 1. Consider an -state Markov chain that is irreducible, meaning that it is possible to get from any state to any other state in some number of steps. Define the random variable to be the time needed to go from state to state . (Note that is NOT 0, though it can be 1 if .) where is the state traveled to immediately after leaving state . This then implies that We'll focus on the case , i.e. look at how long it takes to get to state . Let denote , and define . (Note that has only components!) So, In this problem you'll develop an R function to calculate . Here is an easy (though trivial) example of . Suppose the transition matrix of the chain is Then one can see right away without any computation that [(a)] Give the number of our textbook equation that most justifies (), among material prior to Chapter 4. Your answer will take the form of an R function whicheqn() that consists of a single print() call, e.g. print(\"(2,168)\") [(b)] Using (), write an R function with call form findeta(p) that inputs the Markov chain's transition matrix p and returns . Hints: Remember, in () the are the knowns, and the are the unknowns. Start with a very simple example, say (). 2. This problem concerns the raffle example in Section 13.1 of our book. We have two competing estimators, and you will write simulation code to compare them in terms of bias and mean absolute error. Your code will consist of a function with call form cmp2ests(c,n,nreps) and will return the (approximate) vector c(b1,b2,e1,e2). 
Here c and n are as in the raffle example (but are general, unrelated to the specific data in that example), and nreps is our usual number of ``notebook lines.'' Assume sampling without replacement, even though the theory behind is based on with-replacement sampling. Thus assume that . Solutions: 1a. (3.154) 1b. findeta <- function(p) n <- nrow(p) q <- diag(n) - p q <- q[1:(n-1),1:(n-1)] ones <- rep(1,n-1) solve(q,ones) 2. cmp2ests <- function(c,n,nreps) out <- matrix(nrow=nreps,ncol=2) for (i in 1:nreps) x <- sample(1:c,n,replace=FALSE) out[i,1] <- 2*mean(x) - 1 out[i,2] <- max(x) c( mean(out[,1] - c), mean(out[,2] - c), mean(abs(out[,1] - c)), mean(abs(out[,2] - c))) > cmp2ests(c,n,nreps) [1] -0.003373333 -0.613600000 1.744820000 0.613600000 ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (20) This problem concerns the ALOHA network model. Continue to assume that , , and that the network consists of just two nodes. Suppose . Find the probability that both nodes tried to send during epoch 1. 2. This problem concerns the bus ridership example, pp.20ff. [(a)] (20) Supply the reason for Equation (2.54), in the form of a ``mailing tube'' number, e.g. (2.5). (Write your answer in the form ``(2.x)'', NOT ``Equation (2.x)''. [(b)] (20) In the second-to-last bullet, p.20, we state the assumption that passengers make the decision to alight or not independently. In what equation, among (2.52)-(2.56), is that assumption used? [(c)] (20) An observer at the second stop notices that no one alights there, but it is dark and the observer couldn't see whether anyone was on the bus. Find the probability that there was one passenger on the bus at the time. 3. (20) We toss a coin until we get heads in a row. Let denote the number of tosses needed, so that for instance the pattern HTHHH gives for . 
Fill in the blanks in the following simulation code, which finds the approximate probability that . ngtm <- function(k,m,nreps) count <- 0 for (rep in 1:nreps) blank (a) for (i in 1: blank (b) ) toss <- sample(0:1,1) if (toss) consech <- consech + 1 if (consech == blank (c) ) break else consech <- 0 if (consech < k) count <- count + 1 return(count/ blank (d) ) Solutions: 1. 2.a (2.7) 2.b (2.55) 2.c Let denote the number of passengers alighting at stop i. 3. ngtm <- function(k,m,nreps) count <- 0 for (rep in 1:nreps) consech <- 0 for (i in 1:m) toss <- sample(0:1,1) if (toss) consech <- consech + 1 if (consech == k) break else consech <- 0 if (consech < k) count <- count + 1 return(count/nreps) ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (10) Suppose is the length of a random rod, in inches, and . Let denote the length in feet. Find . 2. (10) In the board game, Sec. 2.11, suppose we start at square 3 (no bonus, since we start there rather than landing there). Let denote the square we land on after one turn. Find . 3. This problem concerns the Monty Hall example, pp.40ff. [(a)] (15) Give the numbers of the ``mailing tubes'' in (3.1) and (3.2), respectively. Use a comma and/or spaces to separate the two equation numbers, e.g. ``(2.1) (2.3)''. [(b)] (15) Consider (3.1). Say we change the left-hand side to . What would be the new numerical value of the numerator on the right-hand side? 4. (20) Look at the simulation code on p.26. Say we wish to find the expected value of , where is the sum of the d dice. Give a line of code, to replace line 11. 5. Consider the Preferential Attachment Graph model, Sec. 2.13.1.. [(a)] (10) Give the number of the ``mailing tube'' justifying (2.69). [(b)] (10) Find . [(c)] (10) Find . Solutions: 1. 2. 3.a (2.8), (2.7) 3.b 4. 
mean(sums^2) 5.a (2.2) 5.b 5.c ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. This entire quiz concerns the committee example in Sec. 3.9.2, pp.61ff. Except for Problem 3(c), all answers are numeric. As usual, numeric answers must be given as R expressions that evaluate to numbers. Note that there are 5 problems, 1-5. 1. (10) Find . 2. (15) Find . For full credit, use an appropriate R function. 3. Consider the following simulation code: sim <- function(nreps) reprecords <- matrix(nrow=nreps,ncol=5) for (rep in 1:nreps) comm <- pickcommittee() reprecords[rep,1:4] <- comm tmp <- sum(comm) # find tmp-(4-tmp) reprecords[rep,5] <- 2*tmp-4 reprecords pickcommittee <- function() # choose the 4-person committee, recording # each time whether a man is picked npeopleleft <- 9 nmenleft <- 6 pickedsofar <- NULL for (i in 1:4) propmen <- nmenleft / npeopleleft manpicked <- sample(0:1,1,prob=c(1-propmen,propmen)) nmenleft <- nmenleft - manpicked npeopleleft <- npeopleleft - 1 pickedsofar <- c(pickedsofar,manpicked) pickedsofar We then run > simout <- sim(100000) We then print out some quantities, as seen below. [(a)] (15) What will be printed out from this? > mean(simout[,5]) [(b)] (15) What will be printed out from this? > mean(simout[,3]) [(c)] (20) What will be printed out from this? > rownums <- which(simout[,1] == 1) > sum(simout[rownums,2]) / length(rownums) Your answer here in Part (c) must be in ``P()'' form, using only symbols in the book, e.g. P(D = 9). 4. (15) Find . 5. (10) Find . Solutions: 1. 2. We need . It is choose(6,2) * choose(3,2) / choose(9,4) 3.a 3.b 3.c 4. is an indicator random variable, and thus its variance is , where . 5. We need to find The latter term is . 
To find , use reasoning similar to that on the top of p.63 to find that ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. When a problem says ``Find,'' do NOT use simulation. 1. Consider the coin and die game, Sec. 4.15.3. [(a)] (15) Find . [(b)] (15) Find . [(c)] (15) Find . 2. (15) Suppose some random variable has a Poisson distribution with . Do NOT use loops in this problem. [(a)] (15) Find . [(b)] (15) Find . (You'll need a mailing tube, but need not cite it.) 3. Consider the parking space example, Sec. 4.2.2. [(a)] (15) Change line 7 in the code so that instead of returning the approximate value of , it returns the approximate value of . [(b)] (10) (Not a continuation of part (a).) We have a caravan of four cars, and thus need four parking spaces. Let denote the distance of the furthest car from the destination. Find . Do NOT answer with a single R function call; instead, you must write an R expression that includes a call to choose(). Solutions: 1.a 1.b is geometric, so its variance is , where . 1.c 2.a ppois(8,3.2) 2.b 3.a mean(dvals <= 12) 3.b Number the spaces 1,2,...,10 in the first block, 11,12,...,20 in the second block and so on. means that the furthest car is in space 23. That in turn means that the fourth empty space was space 23. The probability of this is that of a negative binomial distribution with and . ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (20) For a certain random variable , What is the value of ? 2. (20) The matrix() function in R has an optional argument byrow. 
Here is an example: > matrix(c(5,0,2,0,0,6,8,88,168),ncol=3, byrow=TRUE) [,1] [,2] [,3] [1,] 5 0 2 [2,] 0 0 6 [3,] 8 88 168 Fill in the blank: This argument is used to conveniently input by rows, even though R uses storage. 3. Consider the 3-heads-in-a-row example of Markov chains, Sec. 5.4. [(a)] (20) Look at the code on p.106, and think of what happens during the call to findpi1(). The latter calls solve(). State the value of imp just before that call. [(b)] (10) Suppose spectators are watching this game, and every time the player tosses a tail after having two consecutive heads, the crowd moans, ``Oh no!'', since the player came so close to winning but did not win. After the player has tossed the coin 10000 times, approximately how many times will the crowd have said ``Oh no!''? 4. (15) Consider once again our Bus Ridership example, with the constraint added in Sec. 5.8 that the bus has a capacity of 20 passengers. But let's change the distribution of the to be Poisson with = 0.1. Find . 5. (15) We wish to have a function that, for a given Markov transition matrix p and given pair of states r and s will find ) for any nonnegative integer . Fill in the blanks: > pkrs <- function(p,k,r,s) pwr <- p if (k > 1) for (i in 1:(k-1)) pwr <- # blank (a) # blank (b) Solutions: 1. The probabilities must sum to 1, so . 2. column-major 3.a 3.b 710 4. dbinom(0,2,0.2) * dpois(0,0.1) + dbinom(1,2,0.2) * dpois(1,0.1) + dbinom(2,2,0.2) * dpois(2,0.1) 5. > pkrs function(p,k,r,s) pwr <- p if (k > 1) for (i in 1:(k-1)) pwr <- pwr %*% p pwr[r,s] ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (15) On Section 6.4, find . 2. (15) On p.119, suppose is the number of heads obtained from three tosses of a coin, rather than two. Find . Write your answer only as a numerical expression, NO calls to R functions. 3. 
(15) Suppose for , 0 elsewhere, for some constant . Find . 4. (15) Consider the coin-and-die game, Section 4.15.3. You don't observe the game personally, but you hear that the player took at most 2 turns to roll a 5. Find the probability that the player wins exactly 1. 5. (15) The following simulation finds and returns the long-run average seek distance in the disk drive model, pp.126ff. Fill in the blanks: sim <- function(nreps) # start at the middle track, # but doesn't matter oldtracknum <- 0.5 seeks <- vector(length=nreps) for (i in 1:nreps) tracknum <- blank (a) seeks[i] <- blank (b) oldtracknum <- tracknum blank (c) 6. Consider the Markov inventory model, p.112, and the following run of the code: > inventory(0.8,0.2,5) [1] 0.1936083 0.1932367 0.1950948 0.1858045 0.2322557 [(a)] (15) Find the proportion of days in which a customer leaves emptyhanded. [(b)] (10) Find the proportion of customers who leave emptyhanded. Solutions: 1. 2. pbinom(1,3,0.5) 3. The density must integrate to 1. Solving for yields the value 3/8. 4. The denominator is and the numerator is 5. sim <- function(nreps) oldtracknum <- 0.5 seeks <- vector(length=nreps) for (i in 1:nreps) tracknum <- runif(1) seeks[i] <- abs(tracknum - oldtracknum) oldtracknum <- tracknum mean(seeks) 6.a 6.b Think of what will happen over the course of 10000 days. We will have approximately 12000 customers, among whom will leave emptyhanded. Then divide. ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (15) Suppose and are independent and have Poisson distributions, then it can be shown that also has a Poisson distribution. Fill the blank with a term from our course: We say that the Poisson family is under independent summation. 2. Consider the class enrollment example, p.153. [(a)] (15) Give R code to evaluate Equation (7.24). 
[(b)] (15) Give R code to find the upper 10% point for class size, i.e. a number that only 10% of classes exceed. 3. Consider the toy population example, Sec. 9.2.1. Suppose we take a simple random sample of size 2. Imagine a notebook description of this, with columns labeled , and , and infinitely many lines. [(a)] (15) What is the number of distinct values in the column? [(b)] (10) What is the long-run proportion of rows in which there is a 72 in the column and a 69 in the column? [(c)] (15) What is the long-run proportion of the value 72 in the column? 4. (15) A dart is thrown at the interval (0,1). The position that it hits is a random variable, with density for and 0 elsewhere. Find the expected value of the distance from the dart to the point 0.5. Solutions: 1. closed 2.a (1 - pnorm(30,28.8,3.1)) / (1 - pnorm(25,28.8,3.1)) 2.b qnorm(0.90,28.8,3.1) 3.a 3 3.b 1/6 3.c 1/3 4. ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= Name: Directions: MAKE SURE TO COPY YOUR ANSWERS TO A SEPARATE SHEET FOR SENDING ME AN ELECTRONIC COPY LATER. 1. (20) The courtroom analogy used in our text describing the philosophy underlying significance testing is 2. (15) Consider the beta distribution family, Sec. 6.6.5. It has two parameters, and . In fitting such a model to our data, we would come up with estimates of these two parameters, and . Fill in the blank with a term from our course: The standard deviations of and are called their 3. (15) Suppose we wish to construct an (approximate) 80% confidence interval. What number should we use instead of 1.96? Your answer must consist of an R call. 4. (20) For various distribution families, R provides the functions `d', `p', `q' and `r'. Give nonsimulation R code that computes (11.9), using an appropriate one of these functions. For full credit, your code should not use loops. 5. Let denote the weight of some kind of item. Unknown to us, in the population for in (0,1), 0 elsewhere. 
[(a)] (15) Find the population value . [(b)] (15) We take a random sample , and calculate . Find the exact value of . An expression of the form MUST appear in your electronic answer. Solutions: 1. innocent until proven guilty 2. standard errors 3 qnorm(0.90) 4 1 - pbinom(7,10,0.5) 5.a 5.b . So compute and = 1/6. Our answer is then . ","course":"ECS132"} {"quiz":"listings amsmath xleftmargin=5mm,framexleftmargin=10mm,basicstyle= DIRECTIONS: Write your solutions in a single .tex file, including R code. Your .tar package will consist of that file, its output .pdf file, and a separate file for each problem requiring R code, each such file named in the form x.R, where x is the problem number. Name your .tar file as you did in the homework. Your submission must be in the 132groupquiz directory in handin; it must be timestamped on or before 11:50 a.m. NO LATE SUBMISSIONS; keep submitting the work you have, as you go along, so that you at least have something turned in. You are not necessarily expected to solve all the problems. 1. (30) This problem involves the built-in dataset ToothGrowth in R. Type > ToothGrowth # print data frame > ?ToothGrowth # learn more about it to get acquainted. Fit the model where is a dummy variable for OJ. Do NOT include your code in a .R file, but DO show ALL your code, including data set up, and the output in your .tex file. Interpret the results. 2. (25) I flip a coin twice, getting heads. I then flip it more times, with heads among these flips. Define . Find , with a clean, clear derivation. 3. (25) Write a function with ``declaration'' pval <- function(x,p0) that will return the approximate p-value for a significance test, as follows. Here x is a vector of 0s and 1s, where for instance 1 could mean, Yes, plan to vote for Smith, with 0 meaning No, won't vote for her. We will test the hypothesis where is the true (but unknown) population proportion of 1s. 
The alternate hypothesis is Example: > set.seed(168) > x <- sample(0:1,500,prob=c(0.2,0.8), replace=TRUE) > pval(x,0.75) [1] 0.009823275 4. (20) Write a function with ``declaration'' qpqinv <- function(lmout) where lmout is an object of class \"lm\" (i.e. the value returned from a call to lm()), which will return the matrix . Hint: Here you'll need to explore an actual R \"lm\" object. Solutions: 1.a > summary(lm(len ~ .,data=tg)) Call: lm(formula = len ~ ., data = tg) Residuals: Min 1Q Median 3Q Max -6.600 -3.700 0.373 2.116 8.800 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 5.5725 1.2824 4.345 5.79e-05 *** supp 3.7000 1.0936 3.383 0.0013 ** dose 9.7636 0.8768 11.135 6.31e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 4.236 on 57 degrees of freedom Multiple R-squared: 0.7038, Adjusted R-squared: 0.6934 F-statistic: 67.72 on 2 and 57 DF, p-value: 8.716e-16 Each increase of one unit of dose is estimated to result in an increase of 9.76 in mean tooth length. Orange juice has an estimated impact of 3.7. Statements should NOT be made along the lines of ``Dose is more significant than OJ.'' 2. 3. pval <- function(x,p0) estp <- mean(x) n <- length(x) z <- (estp - p0) / sqrt(p0 * (1-p0) / n) 2 * (1 - pnorm(abs(z))) 4. Lots of ways to do this, such as qpqinv <- function(lmout) n <- length(lmout rank s2 <- sum(lmout ","course":"ECS132"}