
N.P. Singh Professor

The document discusses using data mining techniques to develop an audit selection strategy for improving tax compliance. It aims to identify tax returns that have a high likelihood of underreporting taxes using models built on historical taxpayer data. The key is finding an optimal balance between accurately detecting evaders and minimizing audit costs and examination efforts. Various data mining algorithms could be used to classify taxpayers as evading or not evading taxes based on profile and return characteristics. The model would help allocate limited audit resources more efficiently.

Copyright
© Attribution Non-Commercial (BY-NC)

N.P. Singh, Professor

Examples: Audit Selection Strategy for Improving Tax Compliance

Data Mining Techniques

Audit Selection Strategy for Improving Tax Compliance

The tax administration is required to audit some or all of its taxpayers to check tax evasion and ensure compliance. Conducting audits imposes costs on the tax department as well as on the taxpayer, so audits are welcomed neither by taxpayers nor by the economy. Tax administration agencies must therefore use their limited resources judiciously to achieve maximum taxpayer compliance with minimum intrusion and minimum cost. Data mining algorithms are proposed as the most cost-effective option for making audit selection more efficient and effective.

The objective is to develop a model that identifies the dealers/returns most likely to under-report tax among the large volume of returns filed by dealers in the VAT system, while minimizing examination effort and cost, so that the tax department can deploy its scarce audit resources effectively. These two objectives pull against each other: improving detection means increasing examination effort. The model must therefore strike a trade-off suited to the end objective. It is a predictive model, in that it predicts the likelihood of a dealer under-declaring or evading tax in a return. If the model is used, the Department would be able to allocate its limited resources to more productive and specific purposes.
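A minimal sketch of this trade-off, assuming the model outputs a risk score for each return. The scores, labels and cut-offs below are invented for illustration and are not taken from the Delhi VAT data. Lowering the cut-off catches a larger share of evaders but raises the share of returns that must be examined.

```python
def sweep(scores, evaded, thresholds):
    """For each cut-off, report the share of evaders caught and the
    share of all returns that would have to be examined."""
    rows = []
    total_evaders = sum(evaded)
    for t in thresholds:
        # Returns whose risk score reaches the cut-off are sent for audit.
        selected = [e for s, e in zip(scores, evaded) if s >= t]
        caught = sum(selected) / total_evaders
        examined = len(selected) / len(scores)
        rows.append((t, caught, examined))
    return rows

# Hypothetical risk scores and known outcomes (1 = evaded, 0 = did not).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
evaded = [1, 1, 0, 1, 0, 1, 0, 0]

rows = sweep(scores, evaded, thresholds=[0.75, 0.5, 0.25])
for t, caught, examined in rows:
    print(f"cut-off {t}: evaders caught {caught:.0%}, returns examined {examined:.0%}")
```

With these made-up numbers, moving the cut-off from 0.75 to 0.25 raises the share of evaders caught from 50% to 100%, but the share of returns examined rises from 25% to 75%.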

With the increasing number of taxpayers, a fixed number of cases are audited on a random basis.
Random selection may pick up cases that are under-reporting.
Random selection may equally pick up cases where exact data is reported.
It gives equal treatment to honest and dishonest taxpayers.

For data mining, the dealers are divided into two groups on the basis of the prior information available. The first group is the sample of dealers found to be evading/under-declaring tax. Delhi VAT had approximately 1,80,000 dealers registered with the department.

Criteria for selection:
The list is large. It is impossible to know the complete set of dealers who had evaded or under-reported tax.
It was decided to select the dealers who had been assessed additional tax and where the assessed tax was sustained in the first appeal.
The year was 2003-04.
Data was collected manually (from the files/registers of the Appellate Authorities).
This gave 402 dealers from whom the recoveries for 2003-04 were Rs 50,000/- or more.

The second sample required is of good dealers who, with a reasonable level of confidence, can be said to have been reporting tax correctly.

Criteria:
Businesses that pay high tax were assumed to be ones that declare their taxes correctly.
However, in identifying the sample, precautions were required to ensure that it is sufficiently representative.
For this purpose, the (major) commodity and the nature of business (retailer/wholesaler/exporter etc.) of the 402 dealers were identified.
Thereafter, for each dealer in the evading category, the three highest tax-paying dealers dealing in the same commodity and having the same nature of business were identified.

The tentative list of good dealers so generated was manually checked with the respective wards to see whether official records held any adverse material against them, based on surveys, raids etc. Dealers with such material were deleted from the list. The procedure generated approximately 1,200 dealers. Three times the number of evading dealers was taken to strike a balance between the representativeness and the skewness of the data set.
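The matching procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the department's actual procedure: the field names (`id`, `commodity`, `business`, `tax_paid`) and the sample records are hypothetical, and the rule that a dealer is never picked twice is an assumption the slides do not state.

```python
from collections import defaultdict

def pick_good_dealers(evaders, candidates, flagged_ids, per_evader=3):
    """For each evading dealer, take the highest tax payers with the same
    commodity and nature of business, then delete dealers with adverse records."""
    # Group candidates by (commodity, business type), highest tax paid first.
    groups = defaultdict(list)
    for d in candidates:
        groups[(d["commodity"], d["business"])].append(d)
    for group in groups.values():
        group.sort(key=lambda d: d["tax_paid"], reverse=True)

    evader_ids = {e["id"] for e in evaders}
    tentative, taken = [], set()
    for e in evaders:
        picked = 0
        for d in groups[(e["commodity"], e["business"])]:
            if picked == per_evader:
                break
            # Assumption: skip known evaders and dealers already picked.
            if d["id"] in taken or d["id"] in evader_ids:
                continue
            tentative.append(d)
            taken.add(d["id"])
            picked += 1
    # Ward-level check: drop dealers with adverse material on record.
    return [d for d in tentative if d["id"] not in flagged_ids]

# Hypothetical records for illustration only.
evaders = [{"id": 1, "commodity": "cement", "business": "retailer"}]
candidates = [
    {"id": 10, "commodity": "cement", "business": "retailer", "tax_paid": 900},
    {"id": 11, "commodity": "cement", "business": "retailer", "tax_paid": 700},
    {"id": 12, "commodity": "cement", "business": "retailer", "tax_paid": 500},
    {"id": 13, "commodity": "cement", "business": "retailer", "tax_paid": 300},
    {"id": 14, "commodity": "steel", "business": "retailer", "tax_paid": 999},
]
good = pick_good_dealers(evaders, candidates, flagged_ids={11})
good_ids = [d["id"] for d in good]
print(good_ids)
```

Because flagged dealers are deleted after the top three are chosen, an evader can end up with fewer than three matched good dealers, which is consistent with the list coming out at approximately, rather than exactly, three times 402.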

Two points

Skewness: It has been shown again and again that results derived from non-normal data are not very accurate.
High taxpayers: It is also reported in the literature that high taxpayers are themselves good evaders of tax.

A. Dealer Profile
A1. New registrant (Y/N)
A2. Deals in high tax rate items (Y/N)
A3. Any other business operating from the same address? (Y/N)
A4. Any other business having the same telephone number? (Y/N)

B. Return Compliance
B1. Any return default (non-filing)? (Y/N)
B2. Delay in filing returns? (No. of days)
B3. No. of returns that are NIL returns

C. Returned Values & Ratios
C1. Tax : Turnover
C2. Gross profit %
C3. Exempt Sales : Turnover
C4. Inventory : Turnover
C5. Purchases : Sales
C6. Refund claimed > Rs. 1000? (Y/N)
C7. Tax credit carry forward > Rs. 1000? (Y/N)
C8. Output Tax or Input Tax Credit adjustments > 1000? (Y/N)

D. Variations in the returns across tax-periods
D1. Turnover growth (compared to last year)
D2. Tax growth (compared to last year)
D3. Variance of turnover across tax-periods
D4. Changes/variation in sales mix (local/interstate/export)
D5. Changes/variation in purchase mix (local/interstate/import)
D6. Changes/variation in product mix for sales (exempt, taxable at various rates)
D7. Changes/variation in product mix for purchases (exempt, taxable at various rates)

E. Benchmarking vis-à-vis dealers of similar trade/industry, in respect of the following parameters
E1. Tax : Turnover
E2. Gross profit %
E3. Taxable sales : Turnover
E4. Inventory : Turnover
E5. Turnover growth
E6. Tax growth
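As an illustration of how a ratio variable (C1) and its benchmarked counterpart (E1) might be derived, here is a hedged sketch. The slides do not say how the benchmark is computed, so comparing against the median of the same trade is an assumption, as are the field names and the sample figures.

```python
from statistics import median

def tax_to_turnover(dealer):
    """C1: tax as a percentage of turnover (None when turnover is zero)."""
    if dealer["turnover"] == 0:
        return None
    return 100 * dealer["tax"] / dealer["turnover"]

def benchmark_gap(dealers):
    """E1 (assumed form): each dealer's tax:turnover percentage minus
    the median percentage of dealers in the same trade."""
    by_trade = {}
    for d in dealers:
        r = tax_to_turnover(d)
        if r is not None:
            by_trade.setdefault(d["trade"], []).append(r)
    medians = {trade: median(rs) for trade, rs in by_trade.items()}

    gaps = {}
    for d in dealers:
        r = tax_to_turnover(d)
        gaps[d["id"]] = None if r is None else r - medians[d["trade"]]
    return gaps

# Hypothetical dealers in one trade.
dealers = [
    {"id": 1, "trade": "cement", "tax": 10, "turnover": 100},
    {"id": 2, "trade": "cement", "tax": 4, "turnover": 100},
    {"id": 3, "trade": "cement", "tax": 12, "turnover": 100},
    {"id": 4, "trade": "cement", "tax": 0, "turnover": 0},
]
gaps = benchmark_gap(dealers)
print(gaps)
```

A strongly negative gap, as for dealer 2 here, marks a dealer paying noticeably less tax per unit of turnover than comparable dealers. A dealer with zero turnover gets no ratio and no gap.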

The list of input variables depends on:
Availability of data: if the data for a variable is full of missing values, it should not be considered for model building.
Data inconsistency: inconsistencies are removed when data is integrated into a data warehouse (DW); data that has not been integrated into a DW will be full of inconsistencies.
Programming and computing resources available for computation at the time of model building.
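The availability check can be sketched as a simple screen on the share of missing values per variable. The 30% cut-off and the variable names are assumptions made for illustration; the slides give no threshold.

```python
def usable_variables(records, max_missing_share=0.3):
    """Return the variable names whose share of missing (None) values
    does not exceed max_missing_share."""
    names = sorted({name for r in records for name in r})
    keep = []
    for name in names:
        # A variable absent from a record counts as missing for that record.
        missing = sum(1 for r in records if r.get(name) is None)
        if missing / len(records) <= max_missing_share:
            keep.append(name)
    return keep

# Hypothetical dealer records with gaps in the data.
records = [
    {"gross_profit_pct": 12.0, "inventory_turnover": None},
    {"gross_profit_pct": 9.5, "inventory_turnover": None},
    {"gross_profit_pct": None, "inventory_turnover": 0.4},
    {"gross_profit_pct": 11.0, "inventory_turnover": None},
]
kept = usable_variables(records)
print(kept)
```

Here `inventory_turnover` is missing in three of four records and is dropped, while `gross_profit_pct` is missing in one of four and is kept.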

The target variable in this exercise is whether the dealer has under-reported tax or not. There are two options for defining the target variable. It can be
(a) the amount of tax under-reporting detected, or
(b) the fact whether under-reporting has been detected (yes/no).

Under the Sales Tax/VAT system, 96-97% of revenues come with the returns, and audits contribute only 3-4% of revenues directly. It is the fear of penalties in case of audit that brings in the 96-97% of revenues that come with the returns.

The objective of the audit strategy is not to maximize the revenue from audits but to maximize the strike rate in audits, so that the fear of God is put into tax-evaders. The literature review suggests that, in the IRS system in the US, option (b) as the target variable has yielded better results. Therefore the target variable, TARGET, was set to 1 for dealers detected to have under-declared tax. It was set to 0 for dealers that are high taxpayers of their category and against whom nothing adverse has been detected.

It is a classification problem. Possible models that can be used:
Decision tree: (i) ID3, (ii) CART, (iii) C4.5, (iv) C5.0, (v) CHAID
Logit regression
Neural networks
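To give a feel for how a decision tree separates the two groups, the sketch below finds a single split on one numeric variable using Gini impurity, the criterion CART uses for classification. It is one node of a tree, not a full CART implementation, and the variable (tax as a percentage of turnover) and the data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Find the threshold on one numeric variable that minimises the
    weighted Gini impurity of the two resulting groups."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # equal values cannot be separated by a threshold
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical data: tax as % of turnover, and TARGET (1 = under-declared).
tax_pct = [1, 2, 3, 8, 9, 10]
target = [1, 1, 1, 0, 0, 0]

threshold, impurity = best_split(tax_pct, target)
print(threshold, impurity)
```

In this toy data the split at 5.5% separates the two groups perfectly (impurity 0). A real tree would repeat the search over every input variable and recurse into each branch.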

Business Understanding Phase
Data Understanding Phase
Data Preparation Phase
Model Building Phase / Model Selection Phase
Model Evaluation Phase
Model Deployment Phase

No audit selection strategy can hope to catch all the tax-evaders in practice. However, it is immaterial whether it catches all the under-declarers/evaders. The tax administration would like to deploy its resources where it gets a high strike rate, so that the fear of audits automatically increases the tax revenue coming with the returns. In the process, no harm is done even if some genuine returns are selected for audit.

The following indices are defined in line with the desired objectives of the audit strategy. Here TP, FP and FN are the numbers of true positives, false positives and false negatives, with evasion as the positive class.

Prediction efficiency (PE) = percentage of tax-evasion cases correctly predicted by the model.
PE = TP / (TP + FN)

Reduction in examination effort (EF) = reduction in the number of cases to be examined vis-à-vis the traditional method, where all cases were being examined.
EF = 1 - (FP + TP) / Total cases

Strike rate (SR) = percentage of cases where evasion is likely to be detected if the predicted cases are taken for audit.
SR = TP / (TP + FP)
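The three indices can be computed directly from the counts of a confusion matrix. The sketch below follows the formulas above; the counts are invented for illustration and are not results from the study. PE corresponds to what is commonly called recall, and SR to precision.

```python
def audit_indices(tp, fp, fn, tn):
    """Return (PE, EF, SR) from confusion-matrix counts.
    Assumes at least one actual evader and at least one predicted evader."""
    total = tp + fp + fn + tn
    pe = tp / (tp + fn)          # share of actual evaders the model flags
    ef = 1 - (fp + tp) / total   # share of cases that need not be examined
    sr = tp / (tp + fp)          # share of flagged cases that are evaders
    return pe, ef, sr

# Hypothetical counts: 400 returns, 100 of them evaders, 80 flagged by the model.
pe, ef, sr = audit_indices(tp=60, fp=20, fn=40, tn=280)
print(f"PE = {pe:.0%}, EF = {ef:.0%}, SR = {sr:.0%}")
```

With these made-up counts the model flags 80 of 400 returns, so examination effort falls by 80%, three out of four audits find evasion, and 60% of all evaders are caught.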

The most important parameter for model evaluation is the strike rate (SR). However, prediction efficiency (PE) cannot be ignored either: if the prediction efficiency of the model is low, the model itself is useless, and one may as well resort to random selection.
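One way to make the comparison with random selection concrete is to divide the model's strike rate by the strike rate expected from picking cases at random, which equals the share of evaders among all cases. This ratio (often called lift) is not defined in the slides; it is offered here as an assumption-laden sketch with invented counts.

```python
def lift_over_random(tp, fp, fn, tn):
    """Ratio of the model's strike rate to the expected strike rate
    when the same number of cases is picked at random."""
    total = tp + fp + fn + tn
    strike_rate = tp / (tp + fp)
    base_rate = (tp + fn) / total  # expected strike rate of random selection
    return strike_rate / base_rate

# Hypothetical counts: 400 returns, 100 of them evaders, 80 flagged by the model.
lift = lift_over_random(tp=60, fp=20, fn=40, tn=280)
print(lift)
```

A value near 1 means the model does no better than random selection. The same ratio is obtained by dividing PE by the share of cases examined, so a low PE relative to the effort spent shows up as a low lift.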

[Figure: PE & SR]

[Figure: ER & Comparison]
