RegularExpression
Question 1 
(a) An identifier in a programming language consists of upto six letters and digits of which the first character must be a letter. Derive a regular expression for the identifier.
(b) Build an LL(1) parsing table for the language defined by the LL(1) grammar with productions
Program → begin d semi X end X → d semi X  sY Y → semi s Y  ε
Theory Explanation. 
Question 2 
Which one of the following regular expressions represents the set of all binary strings with an odd number of 1’s?
10*(0*10*10*)*
 
((0 + 1)*1(0 + 1)*1)*10*
 
(0*10*10*)*10*  
(0*10*10*)*0*1

The regular expression ((0+1)*1(0+1)*1)*10* generate string “11110” which is not having odd number of 1’s , hence wrong option.
The regular expression (0*10*10*)10* is not a generating string “01”. Hence this is also wrong . It seems none of them is correct.
NOTE: Option 3 is most appropriate option as it generates the max number of strings with odd 1’s.
But option 3 is not generating odd strings. So, still it is not completely correct.
The regular expression (0*10*10*)*0*1 always generates all string ends with “1” and thus does not generate string “01110” hence wrong option.
Question 3 
(0+1 (01*0)*1)*  
(0+11+10(1+00)*01)*  
(0*(1(01*0)*1)*)*  
(0+11+11(1+00)*00)* 
Transition table
0 
1 

>*q0 
q0 
q1 
q1 
q2 
q0 
q2 
q1 
q2 
DFA of divisible by 3
The regular expression will be
Three paths to reach to final state from initial state
Path 1: self loop of 0 on q0
Path 2: q0>q1>q0 hence 11
Path 3: q0>q1>q2>q1>q0 hence 10(1+00)*01
So finally the regular expression will be
(0+11+10(1+00)*01)*
Other regular expression is (if we consider two paths)
Path 1: Path 1: self loop of 0 on q0
Path 2: q0>q1>q2*>q1>q0
Hence regular expression
(0+1 (01*0)*1)*
Another regular expression is (if we consider only one path and remaining part is present inside this path as cycles)
q0>q1>q2>q1>q0
Another regular expression is
(0*(1(01*0)*1)*)*
So A,B, C are correct.
Option D generates string 11100 which is not accepted by FA hence wrong.
Question 4 
ab * bab * ba * aba *  
( ab * b ) * ab * ( ba * a ) * ba *  
( ab * b ba * a ) * ( a * b * )  
( ba * a ab * b ) * ( ab * ba * ) 
(ab*b)*ab*+(ba*a)*ba* does not generate the string “abbaa” hence this is also wrong.
(ab*b+ba*a)*(a*+b*) generates epsilon hence this is also wrong.
(ab*b+ba*a)*(ab*+ba*) is the correct regular expression.
GATE 2022 Computer Science and Information Technology (CS)
Question 5 
(L + D)*  
(L.D)*  
L(L + D)*  
L(L.D)* 
Question 6 
Set of strings with 2 a’s and 2 b’s  
Set of strings with 2 a’s 2 b’s followed by b  
Set of strings with 2 a’s followed by b’s which is a multiple of 3  
Set of strings with even number of a’s followed by odd number of b’s 
→ Because (aa)* implies even number of a's and (bb)*b implies even number of b's followed by one b , it implies the odd number of b's.
Question 7 
(0*10*1)*  
0*(10*10*)*  
0*(10*1*)*0*  
0*1(10*1)10* 
→ We will guess strings which has even number of 1’s and that is not generated by wrong options OR which generate strings which doesn’t have even number of 1’s.
Option A: (Regular expression: (0*10*1)* ) doesn’t generate string such as { 110, 1100,....}
Option C: (Regular expression: 0*(10*1*)*0* generate string such as {1, 111,....} which have odd number of 1’s.
Option D: (Regular expression: 0*1(10*1)*10* doesn’t generate strings such as { 11101, 1111101, ….}.
Question 8 
aba  
bab  
abba  
aabb 
Question 9 
{a,b,ab,aa}  
{a,b,ba,bb}  
{a,b}  
{aa,ab,ba,bb} 
● (ab) means either a or b.
● The expression consists of two symbols of (ab),which strings starting with either a or b and ending with either a or b.
Question 10 
a*b*  
(a  b)*  
(ab) *  
(a  b*) 
Question 11 
(a+b)  
(a+b)(a+b)*  
(a+b)(a+b)  
all of these 
● The option C gives aa,ab,ba or bb which are of length of 2
● But the CFG will produce strings, which of length of greater than 2 also.
● The option B gives string starts with either a or b followed by any number of a’s or b’s which is equivalent to CFG.
Question 12 
Let r = a(a + b)*, s = aa*b and t = a*b be three regular expressions.
 Consider the following:
(i) L(s) ⊆ L(r) and L(s) ⊆ L(t)
(ii). L(r) ⊆ L(s) and L(s) ⊆ L(t)
Choose the correct answer from the code given below :
Code :Only (i) is correct  
Both (i) and (ii) are correct
 
Only (ii) is correct
 
Neither (i) nor (ii) is correct

L(r) = { ab, aab, aaab, aaaab, ………… }
L(t) = { b, ab, aab, aaab, aaaab, ……… }
L(s) can generate every possible string over a, b but L(r) and L(t) can’t generate every possible string over a, b.
So L(s) ⊆ L(r) and L(t) ⊆ L(r).
L(r) can’t generate “b” but L(t) can’t.
So L(s) ⊆ L(t).
Question 13 
Strings not starting with 11  
Strings of odd length  
Strings starting with 00  
Strings of even length 
We can’t generate strings with only 00, we can generate strings starting with 00, 01 and 10 also, So option C is incorrect.
From the given expression (00+01+10)(0+1)*, we can’t generate strings with 11 , we can generate strings starting with 00, 01 and 10.
Question 14 
(0+0+1)(0+0+1)  
(0+0+0)(0+1+1)  
(1+0+0)(1+1+1)  
(0+1+0)(1+0+1) 
→ (0+0+0) gives the value 0
→ (0+1+1) gives either 0 or 1.
→ (0+0+0)(0+1+1) gives ), as product of 0 with anything is 0
Question 15 
If r and s are regular expression representing the languages L(r) and L(s), respectively, then which of the following is FALSE?
(r)(s) is a regular expression representing L(r)UL(s)
 
(r)(s) is a regular expression representing (L(r))* or (L(s))*  
(r)* is a regular expression representing (L(r))*  
(r)(s) is regular expression representing L(r)L(s)

1. Union : (r)(s) is a regular expression denoting L(r) U L(s)
2. Concatenation : (r)(s) is a regular expression denoting L(r)L(s)
3. Kleene closure : (r)* is a regular expression denoting (L(r))*
(r) is a regular expression denoting L(r).
Question 16 
0*1 ^{+} 0*1*  
0*1*0 ^{+} 1*  
0*1*0*1 ^{+}  
0^{+}1*0*1* 
Question 17 
I. p443y
II. p6y
III. 3xyz
IV. p35z
V. p353535x
VI. ppp5
I, III and VI only  
IV, V and VI only  
II, IV and V only  
I, IV and V only 
Given regular expression is p+[35]*[xyz]
It means we have to start with p.
[35]* means (3+4+5)* It can generate these numbers many number of times.
[xyz] → ends with either x or y or z or any combination but not violate order of occurrences
I. p443y → matched because it satisfied above regular expression.
Il. p6y → not matched because 6 not present in regular expression.
III. 3xyz → not matched because it not starts with p.
IV. p35z → matched because it satisfied above regular expression.
V. p353535x → matched because it satisfied above regular expression.
Vl. ppp5 → not matched because it is not end with x or y or z or any combination.
Question 18 
08  
09  
10  
12 
Hence (0^{+}1^{+}  2^{+}3^{+})* = (0^{+}1^{+} + 2^{+}3^{+})* = (a+b)*
Now since the min string in 0^{+}1^{+} is "01" hence "a" has min length two.
Similarly the min string in 2^{+}3^{+} is "23" , hence "b"has min length two.
We require 4 length strings. ^{+} We can generate 4 length strings using "a" only, or using "b" only or using both.^{+} Using "a" one time only → 0^{+}1^{+} only: {0001, 0011, 0111}
using "b" one time only → 2^{+}3^{+} only : {2223, 2233, 2333}
Using two time, i.e., aa, ab, ba, bb
aa→ 0101
ab→ 0123
ba→ 2301
bb→ 2323
'Hence total 10 strings {0001, 0011, 0111, 2223, 2233, 2333, 0101, 0123, 2301, 2323}
Question 19 
08  
10  
11  
12 
1111, 0111, 1112, 0122, 0123, 1222, 1223, 0122, 0113, 1123, 0112, 1113.
So, total 12 strings of length 4 are possible.
Question 20 
{a}  
{ε, a, b}  
{a, b}  
None of these 
Question 21 
S →AB/AS, A→a/aA, B→b
aa*b +  
aa*b  
(ab)*  
a(ab)* 
It is clearly representing
aa*b
Question 22 
(i) L(s) ⊆ L(r) and L(s) ⊆ L(t)
(ii). L(r) ⊆ L(s) and L(s) ⊆ L(t)
Only (i) is correct  
Both (i) and (ii) are correct  
Only (ii) is correct  
Neither (i) nor (ii) is correct 
L(r) = { ab, aab, aaab, aaaab, ............}
L(t) = { b, ab, aab, aaab, aaaab, .........}
L(s) can generate every possible string over a, b but L(r) and L(t) can’t generate every possible
string over a, b. So L(s)⊆L(r) and L(t)⊆L(r)
L(r) can’t generate “b” but L(t) can’t. So L(s)⊆L(t).
Question 23 
(1 + 010)*  
(01 + 10)*  
(1 + 010)* (0 + λ)  
(1 + 01)* (0 + λ) 
Option A is incorrect because it can generate 010010 string which is having a pair of consecutive zeros.
Option B is incorrect because it can generate 1001 string which is having a pair of consecutive zeros.
Option C is incorrect because it can generate 0100 string which is having a pair of consecutive zeros.
Option D is correct because it will generate all the strings mentioned in the given language L
Question 24 
a*b  
a*baa*  
a*ba*  
None of the above 
L1={ba, baa, baaa, .........................aba, aaba, aaaba,................abaa, abaaa, ....................}
L2={a, ab, abb, abbb, ..........................}
Take "a" from L1:
L1/a ={ b, ba, baa, .................ab, aab, aaab, ....................aba, abaa, .....................}
So we get : a*ba*
If you take "ab" from L2
Like:
L1/ab = you will not get any string i,e. empty set will be in result in case of L1/ab
as no string in L1 end with "ab"
Question 25 
(i) a
(ii) aba
(iii) abaabaaba
(iv) abaabaabaabaabaabaabaabaaba
Using this as an information, how would you compare the following regular expressions ?
(i) (aba)^{3x }
(ii) a.(baa)3^{x}–1. ba
(iii) ab.(aab).^{3x}–1.a
(ii) and (iii) are same, (i) is different.  
(ii) and (iii) are not same.  
(i), (ii) and (iii) are different.  
(i), (ii) and (iii) are same.

If you observe all of them is generating (aba)x
Note: This question is ambiguous, the wordings are not very clear.
Question 26 
R_{1} : U → IU
R_{2} : M.x → M.x.x where . .. is string concatenation operator. Given this, which of the following holds for
(i) MIUIUIUIUIU
(ii) MIUIUIUIUIUIUIUIU
Either (i) or (ii) but not both of these are valid words.  
Both (i) and (ii) are valid words and they take identical number of transformations for the production.  
Both (i) and (ii) are valid words but they involve different number of transformations in the production.  
None of these 
It has M (5 times ) IU
So start with MIU
Production 2 says: M.x> M.x.x
So
MU> MUU again using it
MUU> MUUU > MUUUU >MUUUUU
Now using prod 1: Last U> IU
MUUUUU > MIUUUUU
MIUUUUU > MIUIUUUU
MIUIUUU > MIUIUIUIUIU
Question 27 
a ∗ b ∗  
(a ∗ b + b) ∗  
(a + b ∗ ) ∗  
(a ∗ b) ∗ 
Question 28 
Set of all string not containing ‘11’  
Set of all string not containing ‘00’  
Set of all string containing ‘01’  
Set of all string ending in ‘0’ 
According to the language L generated by given regular expression
Option(A) is incorrect because L contains 110.
Option(B) is correct because no string is containing 00
Option(C) is incorrect because not all strings in L are containing 01 example 0 in L does not contain 01.
Option(D) is incorrect because L contains 1λ is there in L which is not ending with 0.
Hence correct answer is option (B)
Question 29 
L(e) is empty.  
L(e) is finite.  
Complement of L(e) is empty.  
Both L(e) and its complement are infinite. 
Question 30 
(λ + a + aa + aaa) b* + a* bbbb* + (a + b)* ba(a + b)*  
(λ + a + aa + aaa) b* + a* bbbbb* + (a + b)* ab(a + b)*  
(λ + a + aa + aaa) + a* bbbbb* + (a + b)* ab(a + b)*
 
(λ + a + aa + aaa)b* + a* bbbbb* + (a + b)* ba(a + b)* 
Question 31 
(a) (r + s)* = (s + r)*
(b) (r*)* = r*
(c) (r* s*)* = (r + s)*
Which of the above identities are true?
(a) and (b) only  
(b) and (c) only  
(c) and (a) only  
(a), (b) and (c) 
∅* = ε
ε* = ε
RR* = R*R
R*R* = R*
(R*)* = R*
RR* = R*R
(PQ)*P =P(QP)*
(r+s)* = (r*s*)* = (r*+s*)* = r*(sr*)*
Hence option(D) is correct.
Question 32 
regular expression is a*b(a+b)
1  
2  
3  
4 
Please note: it is asked about minimum finite automata (so its about min NFA and not min DFA)
Question 33 
(a + b)^{∗}aba  
aba(a + b)^{∗}aba  
(a + b)aba(b + a)^{∗}  
aba(a + b)^{∗}  
(ab)^{∗}aba 
Question 34 
(1 + 01) * (10 + 01)  
(1 + 01) * 01  
(1 + 01) * (1 + 01)  
(10 + 01) * 01 
Option(A) is not correct because it can generate “10” which is not ending with “1”.
Option(B): It can’t generate “1” which is a string belongs to given language L. So it is not a correct option.
Option(C) : It is the correct option because it can generate all the strings generated by a given language.
Option(D): It is not the correct answer because it can’t generate strings like 1,11,........
Question 35 
r= a*b*c  
r= aa*bb*c  
r= aa*b*cc  
r= aa*bb*cc* 
→Substitute in r=s=t=1 in w then L= abc
→So option D gives the regular expression abc
Question 36 
S → Ax By A → ByCw B → xBw C → yWhich of the regular expressions describes the same set of strings as the grammar?
xw+y+xw*yx+ywx  
xwy+xw*xy*ywx
 
xw*y+xw*yx+ywx  
xwxy+xww*yywx 
Option B do not contain xy which is generated by the grammar.
Option C do not contain xy which is generated by the grammar.
Question 37 
(a∪b)*  
(b*∪a*)*  
(b∪a)*  
(a∪b) 
Question 38 
(a + b)* ba(a + b)*  
a* bbbbb*  
None of the above 
Option (A) is incorrect because the smallest string it can generate is “ba” which does not belong to L.
Option (B) is incorrect because the smallest string it can generate is “bbbb” which does not belong to L.
Option (C) is incorrect because the smallest string it can generate is “λ” which does not belong to L.
Regular expression for given language L is aaaaa*(λ +b+bb+bbb)
Hence option(D) is the correct answer.
Question 39 
https://solutionsadda.in/wpcontent/uploads/2020/04/tifr7.jpg
Which of the following regular expressions corresponds to the language accepted by this automaton?
(a + b)^{∗}aba  
(a + b)^{∗}ba^{∗}  
(a + b)^{∗}ba(aa)^{∗}  
(a + b)^{∗}  
(a + b)^{∗}baa^{∗} 
Question 40 
(10^{∗}10^{∗})^{∗}  
(10^{∗}10^{∗}1)^{∗}  
0^{∗}1(10^{∗}1)^{∗}10^{∗}  
0^{∗}1(0^{∗}10^{∗}10^{∗})^{∗}10^{∗}  
(0^{∗}10^{∗}1)^{∗}0^{∗} 
Question 41 
(a + b)* = a*b*  
a(b + c) = ab + c  
(a + b)* = a* + b*  
(ab + a)*a = a(ba + a)*  
None of the above 
Question 42 
00  
10  
010  
011 
Question 43 
00  
11  
100  
1001 
Question 44 
a*ba*ba*  
a*bba*  
bba*  
a*bb 
Option C is false because string abab will not be generated by the given regular expression.
Option D is false because string abab will not be generated by the given regular expression.
Option A can generate all the strings with exactly two b’s.
Question 45 
a*(c + d) + b(c + d)  
a*(c + d) + b(c + d)  
(a* + b)c + (a* +b)d  
(a+b)*c + (a + b)*d 
Given (a+b)*c + (a + b)*d
Now take (a+b)* out as common ,then we will get (a+b)*(c+d).and we know that
(a*+b)* = (a+b)*.So we can also write (a+b)*(c+d) as (a* + b)*(c + d) .
Question 46 
a*(c * d) + b(c + d)  
a*(c * d) + b*(c + d)  
(a + b)*c + (a + b)*d  
(a* + b)c + (a* + b)d 
Option B cant generate string ‘abc’ which can be generated by the regular expression given in question.
Option D cant generate string ‘abbc’ which can be generated by the regular expression given in the question.
Option C is correct answer because
(a + b)*c + (a + b)*d = (a+b)* (c+d) = (a*+b)*(c+d)
Question 47 
L(r)⊆L(s)⊆L(t)  
L(r)⊇L(s)⊇L(t)  
L(r)⊇L(t)⊇L(s)  
L(s)⊇L(t)⊇L(r) 
s={aa,ab,aaa,aab,aba,aaab,......}
t={ab,aab,aaab,....}
If you notice then r⊇s⊇t
Question 48 
L_{1} ⊂ L_{2}  
L_{2} ⊂ L_{1}  
L_{1} = L_{2}  
L_{1} ∩ L_{2} = ɸ 
(a+b)* = (a* +b)*= (a+b*)=(a*b*)*= (a* +b*)*= a*(ba*)*=b*(ab*)*
Question 49 
r1 = (a + b)^{+} a (a + b)*
r2 = (a + b)*b (a + b)^{+}
Let L1 and L2 be the languages defined by r1 and r2, respectively. Which of the following regular expressions define L1 ∩ L2?
(a + b)^{+} a (a + b)*b (a + b)^{+}  
(a + b)* a b (a + b)*  
(a + b)*b (a + b)* a (a + b)*  
(a + b)* a (a + b)*b (a + b)* 
Option (a) is incorrect since it accepts only words of length at least 4.
Option (b) is incorrect since it accepts ab, which is not in L1 ∩ L2.
Option (d) is incorrect once again as it accepts ab.
Question 50 
((11)0*)*  
1(01*)*  
(10*)*  
(11 + 101*01 + 0)* 
(a) It can be checked that L(((11)0*)*) does not contain the string 1001, which represents 9.
(b) L((10*)*) contains 1, which represents a nonmultiple of 3.
(c) L(1(01*)*) contains 101 (representing 5), as also 1 (representing 1).
(d) This is the only remaining option. One can prove it by first constructing a DFA for binary strings that are multiples of 3 and converting it to a regular expression.
Question 51 
Only (i) is false  
Only (ii) is false  
Only (iii) is false  
Both (i) and (iii) are false  
None of the above 